The major principal direction alignment principle is investigated in detail for the
case  when  the  maximum  singular  value  is  repeated.  A  first  result  is  a  new  proof  based  on 
duality  theory  for  the  necessary  and  sufficient  conditions  that  ensure  equality  of  the 
spectral radius and maximum singular value of a matrix; namely, that there must exist at
least  one  aligned  pair  of  major  input-output  principal-direction  vectors.  A  second  result 
is  the  development  of  a  novel  numerical  optimization  algorithm  to  solve  the  optimal 
similarity-scaling  problem  that  yields  an  upper  bound  for  the  structured  singular  value. 
The  algorithm  provides  a  systematic  procedure  for  identifying  the  steepest-descent  search 
direction  even  for  the  case  when  the  singular  value  is  repeated  and  the  underlying 
optimization  problem  is  locally  nondifferentiable.  The  key  theoretical  element  is  the 
characterization  of  the  subdifferential  at  every  point  of  nondifferentiability. 

The critical-direction theory is extended to include nonconvex critical uncertainty
value  sets  through  the  introduction  of  a  general  definition  of  the  critical  perturbation 
radius.  The  Nyquist  robust  stability  margin  is  calculated  for  systems  with  affine 
parametric  uncertainty  using  an  explicit  map  from  the  parameter  space  to  the  Nyquist 
plane.  A  practical  design  approach  based  on  parameter  space  methods  is  introduced. 
First  the  controller  parameters  that  result  in  robustly  stable  closed-loop  systems  are 
determined.  Then,  a  performance  objective  is  optimized  over  the  set  of  robustly 
stabilizing  controller  parameters,  resulting  in  a  robustly  stabilizing  controller  with  some 
optimal  performance  characteristics. 

A  formal  robustness  analysis  of  popular  proportional-integral  controller  tuning 
rules  for  systems  approximated  by  a  first-order-plus-time-delay  model  is  presented.  The 
uncertainty  in  the  process  model  is  represented  by  multiplicative  parametric  perturbations 
in  the  process  gain,  process  time  constant,  and  process  time-delay.  The  robustness  of  the 
uncertain  system  is  characterized  in  terms  of  the  set  of  all  perturbations  that  result  in 
stable  closed-loops.  This  set  is  used  to  calculate  the  standard  gain  and  phase  margins, 
and  the  parametric  stability  margin  which  is  a  metric  of  robustness  to  simultaneous 
variations  in  all  three  system  parameters.  These  margins  are  used  to  compare  the  relative 
robustness  properties  of  several  disturbance-rejection  and  tracking  tuning  rules  in 
widespread  use. 

CHAPTER 1
INTRODUCTION 

1.1.  Motivation 

Uncertainty  is  a  fact  of  any  real-world  system.  This  uncertainty  inherently 
translates  to  the  model  of  the  system  used  for  control  design,  and  is  most  often  in  the 
form  of  neglected  dynamics  or  variations  in  model  parameters.  An  important 
requirement  of  any  control  system  is  that  it  be  robust  (i.e.,  it  functions  satisfactorily  under 
these  uncertainties),  and  the  design  of  such  control  systems  is  known  as  Robust  Control. 
An  important  aspect  of  the  robust  control  problem  is  the  robust  analysis  problem  which  is 
determining  if  a  control  system  satisfies  stability  and  performance  requirements  given  an 
admissible  set  of  uncertainties. 

Robust  stability  is  obviously  a  necessary  requirement  for  robustness,  and  has  been 
studied  since  the  earliest  days  of  feedback  control,  which  originated  to  desensitize  control 
systems  to  changes  in  the  process  as  well  as  stabilize  unstable  systems.  The  classical 
design  techniques  focused  on  frequency  domain  methods  such  as  those  based  on  Bode 
plots  and  Nyquist  plots  (Nyquist,  1932)  and  resulted  in  the  gain  and  phase  stability 
margins. With the advent of the space race in the 1960s, the focus of control engineers
shifted away from frequency-domain robust stability methods to the field of optimal
control. In fact, the linear quadratic Gaussian (LQG) design appeared to give controllers
with good stability properties, but in the late 1970s it was found that LQG and other
prevailing methods of control design, such as state feedback through observers, lost their
stability guarantees under uncertainty.

As a result, $H_\infty$ optimal control (Zames, 1981; Francis and Zames, 1984; Doyle,
1983; Safonov and Verma, 1985; Doyle et al., 1989) was introduced as a framework to
effectively deal with robust stability and performance problems. The theory provides a
precise formulation and solution to the problem of synthesizing an output feedback
compensator that minimizes the $H_\infty$ norm of a prescribed system transfer function. The
method considers unstructured uncertainties, where the only information known about the
uncertainty is a norm bound. Typically, more information about the uncertainty is known
than a simple norm bound. As a consequence, several robust analysis methods have been
developed to consider these structured uncertainties.

Possibly the most effective and comprehensive result is the structured singular
value ($\mu$) analysis method introduced by Doyle (1982), which considers the problem of
robust stability for a known plant subject to a block-diagonal uncertainty structure under
feedback. In general, any block-diagram interconnection of systems and uncertainties
can be rearranged into the block-diagonal standard form. The reciprocal of $\mu$ corresponds to
the size of the smallest structured uncertainty that will destabilize the system. Unfortunately,
calculating $\mu$ is not trivial; in fact, the underlying optimization problem has been proven
to be NP-hard (Braatz et al., 1994). However, there is a convex optimization problem that gives
a conservative upper bound for $\mu$. In addressing the existence of solutions to the proposed convex
optimization, Kouvaritakis and Latchman (1985) introduced the major principal direction
alignment (MPDA) property, which gives necessary and sufficient conditions for
$\mu$ to equal its upper bound, thus eliminating the conservatism.

Another robust stability analysis method for structured uncertainties is the critical-direction
theory developed by Latchman and Crisalle (1995) and Latchman et al. (1997),
which addresses the problem of robust stability of systems affected by uncertainties that
are characterized in terms of arbitrary frequency-domain value sets that are convex. The
critical-direction theory proposes the Nyquist robust stability margin as a measure of
robust stability, which has obvious connections to the Nyquist stability criterion. The
advantage of the critical-direction theory over the structured singular value theory is that
for several common structured uncertainty types there is an analytical expression for the
Nyquist robust stability margin. Moreover, even when no analytical expression is available,
determining the Nyquist robust stability margin is a tractable problem.

Another type of structured uncertainty is real parametric uncertainty in the
process model. The robust stability problem under parametric uncertainty began to
receive renewed attention with the seminal result of Kharitonov (1979) on the stability of
interval polynomials, which is considered the most important development in the area after
the Routh-Hurwitz criterion. The theory makes it possible to determine whether a linear
time-invariant control system containing several uncertain real parameters remains stable as
the parameters vary over a set (Bhattacharyya et al., 1995). Accordingly, the parametric
stability margin is defined as the length of the smallest perturbation in the parameters
that destabilizes the closed loop. The parametric stability margin is useful in controller
design as a means of comparing the performance of proposed controllers.

1.2. Objective and Structure of Dissertation

The first goal of this dissertation is to revisit the MPDA principle to strengthen
the result when the maximum singular value is repeated. Chapter 2 introduces a revised
statement of the MPDA property that fully considers the case of a repeated maximum
singular value. An alternative proof is presented that is based on the theory of dual
norms and dual vectors, which was the inspiration for the original result. The MPDA
results are also used to determine the upper bound on $\mu$ given by the minimization over
a positive diagonal similarity scaling of the maximum singular value. When the
maximum singular value is distinct, there exists an analytical expression for the gradient
of the objective function. The first-order necessary condition for a minimum (i.e., the
gradient being identically zero) is equivalent to MPDA; therefore the minimum is a tight
upper bound. Chapter 3 investigates this optimization problem when the maximum
singular value is repeated, so that the gradient does not exist and the objective function
is nondifferentiable. One result is a method for determining the subdifferential (the set of
all subgradients) when the maximum singular value is repeated. The necessary condition
for a minimum is that zero is an element of the
subdifferential. Furthermore, it is shown that MPDA is still achievable when zero is on
the boundary of the subdifferential; otherwise, MPDA is not attainable and the upper
bound on $\mu$ is conservative. Finally, Chapter 4 gives a necessary condition for the
optimal similarity scaling. The necessary condition requires that the vector of diagonal
elements of the similarity scaling be an element of the null space of a matrix formed from
the absolute values of the elements of the left and right eigenvectors of the matrix.

The  second  goal  of  this  dissertation  is  the  extension  of  the  critical  direction  theory 
to  the  more  general  case  where  the  critical  value-set  is  nonconvex.  This  work  is 
presented  in  Chapter  5.  The  key  to  extending  the  theory  is  the  introduction  of  a 
generalized  definition  of  the  critical  perturbation  radius  in  a  fashion  that  preserves  all 
previous  results.  The  nonconvexity  of  the  critical  value  set  is  observed  in  a  number  of 
interesting  problems,  including  the  case  studied  by  Fu  (1990)  consisting  of  rational 
systems where the uncertainty appears affinely in the form of real parameters that belong
to a known rectangular polytope. The generalized critical direction theory is applied to
this  particular  class  of  uncertain  systems,  and  is  used  to  calculate  the  required  Nyquist 
robust  stability  margin  with  high  precision  and  in  the  context  of  a  computationally 
manageable  framework.  Finally,  Chapter  6  proposes  a  practical  design  approach  based 
on  parameter  space  methods  (Siljak,  1989)  to  illustrate  the  utility  of  the  Nyquist  robust 
stability  margin  as  a  measure  of  robust  stability. 

The  final  goal  of  this  dissertation  is  to  perform  a  complete  robust  stability  analysis 
of  classical  proportional-integral  (PI)  controller  design  techniques  based  on  approximate 
first-order-plus-time-delay  models.  The  uncertain  parameters  for  this  problem  are  the 
plant  gain,  plant  time  constant,  and  plant  time  delay.  The  region  of  all  stabilizing 
parameter  perturbations  is  determined.  By  modeling  the  uncertainties  as  multiplicative 
perturbations  it  is  shown  that  the  stability  properties  of  the  closed-loop  system  are  only 
dependent  on  the  time-delay-to-time-constant  controller  tuning  parameter.  The  results 
include  plots  of  the  classical  gain  and  phase  margin  and  the  parametric  stability  margin  as 
a  function  of  the  controller  tuning  parameter  for  the  PI  controller  design  methods 
investigated. 


CHAPTER  2 

A  DUALITY  PROOF  FOR  THE 

MAJOR  PRINCIPAL  DIRECTION  ALIGNMENT  PRINCIPLE 

2.1.  Introduction 

The structured singular value, $\mu(M)$, defined as the supremum of the spectral
radius of $MU$ over diagonal unitary matrices $U$ (Doyle, 1982), is a widely accepted tool
in  the  robust  analysis  of  linear  systems.  It  considers  the  problem  of  robust  stability  for  a 
known  plant  subject  to  a  block-diagonal  uncertainty  structure  under  feedback.  In  general, 
any  block-diagram  interconnection  of  systems  and  uncertainties  can  be  rearranged  into 
the block-diagonal standard form. Calculating $\mu$ is not trivial; in fact the underlying
optimization  problem  has  been  proven  to  be  NP-hard  (Braatz  et  al.,  1994).  The  difficulty 
is  that  the  spectral  radius  is  non-convex  over  the  set  of  unitary  matrix  transformations. 
One  approach  is  to  consider  upper  bounds  for  the  spectral  radius  that  can  be  calculated 
easily,  and  ideally  should  be  attainable  to  eliminate  conservatism.  The  maximum 
singular  value  is  a  reasonable  choice  for  an  upper  bound  because  it  is  invariant  under 
unitary  matrix  transformations.  In  addition  the  maximum  singular  value  upper  bound  can 
be  decreased  by  optimizing  over  similarity  transformations,  because  the  spectral  radius  is 
invariant  under  such  transformations.  Ultimately,  the  problem  becomes  one  of 
conditioning  a  matrix  through  optimal  similarity  and  unitary  transformations  to  achieve 
equality  between  the  spectral  radius  and  maximum  singular  value.  Therefore, 
determining  the  conditions  under  which  the  upper  bound  is  attained  is  a  significant  issue 
in  the  field  of  robust  control. 
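
The three facts used in the paragraph above can be checked numerically. The following is only a minimal sketch, not part of the original development: the random matrix `m`, the diagonal unitary `u`, the positive diagonal scaling `d`, and the two helper functions are all illustrative assumptions.

```python
import numpy as np

def spectral_radius(a):
    """Largest eigenvalue modulus of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(a))))

def max_singular_value(a):
    """Largest singular value (the induced 2-norm)."""
    return float(np.linalg.svd(a, compute_uv=False)[0])

rng = np.random.default_rng(0)
n = 4
m = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Illustrative diagonal unitary U and positive diagonal scaling D.
u = np.diag(np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, n)))
d = np.diag(rng.uniform(0.5, 2.0, n))
d_inv = np.linalg.inv(d)

rho_mu = spectral_radius(m @ u)
sigma_m = max_singular_value(m)
sigma_mu = max_singular_value(m @ u)
rho_scaled = spectral_radius(d @ m @ u @ d_inv)
sigma_scaled = max_singular_value(d @ m @ u @ d_inv)

print(rho_mu <= sigma_mu + 1e-12)      # spectral radius never exceeds sigma_max
print(np.isclose(sigma_m, sigma_mu))   # sigma_max is unchanged by the unitary U
print(np.isclose(rho_mu, rho_scaled))  # spectral radius is unchanged by similarity
print(rho_mu <= sigma_scaled + 1e-12)  # each scaled sigma_max is still an upper bound
```

Because the scaled maximum singular value generally differs from the unscaled one while remaining an upper bound, minimizing it over `d` can only tighten the bound, which is the motivation for the optimal similarity-scaling problem.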


In  addressing  the  existence  of  solutions  to  the  proposed  optimization, 
Kouvaritakis  and  Latchman  introduce  the  major  principal  direction  alignment  (MPDA) 
property  (1985).  The  result  states  that  the  spectral  radius  of  a  matrix  is  equal  to  the 
maximum  singular  value  of  the  matrix  if  and  only  if  the  major  input  and  the  major  output 
principal  direction  of  the  matrix  are  aligned.  MPDA  is  a  strict  condition  for  a  matrix,  but 
can  be  used  to  determine  the  optimal  positive  diagonal  matrix  and  unitary  matrix  that 
result in equality between the aforementioned definition of $\mu$ and the maximum singular
value  upper  bound.  The  proof  of  the  MPDA  principle  is  based  on  linear  algebra 
arguments,  and  considers  separately  the  cases  of  a  unique  and  a  repeated  maximum 
singular  value.  For  either  case  the  proof  of  sufficiency  is  straightforward.  The  proof  of 
necessity  for  the  case  of  a  unique  maximum  singular  value  is  precise  but  not  as  clear-cut. 
On  the  other  hand,  the  proof  of  necessity  for  the  case  of  a  repeated  maximum  singular 
value  is  slightly  ambiguous. 

The  inspiration  for  the  MPDA  principle  is  early  work  on  determining  when  the 
spectral  radius  equals  the  maximum  singular  value  for  positive  matrices  transformed  by 
non-negative  diagonal  matrices  (Stoer  and  Witzgall,  1962).  One  motivation  for  the  focus 
on  positive  matrices  is  that  they  have  good  numerical  properties  (i.e.,  less  round  off  error) 
and  therefore  may  be  used  for  conditioning  of  matrices.  In  addition,  positive  matrices 
remain  positive  under  transformations  by  non-negative  diagonal  matrices  leading  to 
connections to Perron roots $\pi(M)$ (positive eigenvalues of largest modulus) of positive
matrices  M  (Ortega,  1987).  These  results  on  positive  matrices  are  based  on  the 
mathematical  concepts  of  dual  norms  and  dual  vectors  utilized  by  Bauer  (1962)  which 
lead  to  elegant  proofs  for  many  of  the  results. 

It is the goal of this chapter to revisit the MPDA principle from the viewpoint of
duality  theory.  To  facilitate  the  reading,  the  next  section  provides  relevant  mathematical 
background  including  a  discussion  of  the  singular  value  decomposition,  a  summary  of 
dual-norm  and  dual-vector  concepts,  and  a  dual  eigenvalue  result.  Section  2.3  introduces 
the  original  MPDA  theorem.  Section  2.4  provides  a  modified  statement  of  the  MPDA 
principle  theorem  that  explicitly  considers  a  repeated  maximum  singular  value  with  a 
proof  based  on  the  dual-norm  and  dual-vector  theory.  Several  examples  are  given  in 
Section  2.5  and  concluding  remarks  are  made  in  Section  2.6. 

2.2.  Mathematical  Background 
2.2.1.  The  Singular  Value  Decomposition  and  Eigenvalue  Decomposition 

In this chapter only square matrices are considered; therefore the definitions
are specialized to this type of matrix, although the singular value
decomposition theory applies to general rectangular matrices. The singular value
decomposition of an arbitrary matrix $A \in \mathbb{C}^{n \times n}$ is given by

$$A = X(A)\,\Sigma(A)\,Y^*(A) \tag{2.1}$$

where $\Sigma(A) := \mathrm{diag}(\sigma_1(A), \sigma_2(A), \ldots, \sigma_n(A))$ is the diagonal matrix of singular values
organized in descending order, and $X(A)$ and $Y(A)$ are unitary matrices. The singular
values of a square matrix $A \in \mathbb{C}^{n \times n}$ are given by

$$\sigma_i(A) := \sqrt{\lambda_i(A^*A)}, \quad i = 1, 2, \ldots, n$$

where $\lambda_i(A^*A)$ denotes the $i$-th eigenvalue of the matrix $A^*A$ and where the singular
values are ordered such that

$$\sigma_1(A) \ge \sigma_2(A) \ge \cdots \ge \sigma_n(A)$$

The matrices $X(A)$ and $Y(A)$ are of the form

$$X(A) = [\,x_1(A) \;\; x_2(A) \;\; \cdots \;\; x_n(A)\,]$$

$$Y(A) = [\,y_1(A) \;\; y_2(A) \;\; \cdots \;\; y_n(A)\,]$$

where the set of normalized left singular vectors (output principal directions) $\{x_i(A)\}$ and
normalized right singular vectors (input principal directions) $\{y_i(A)\}$, for $i = 1, 2, \ldots, n$,
respectively constitute orthonormal eigenbases of $AA^*$ and $A^*A$. Furthermore, a pair of
singular vectors $\{x_i(A), y_i(A)\}$ is associated with each singular value $\sigma_i(A)$ through the
relationship

$$A\,y_i(A) = \sigma_i(A)\,x_i(A) \tag{2.2}$$

The maximum singular value is denoted $\bar{\sigma}(A)$. It must be noted that the
maximum singular value can be a repeated singular value, i.e.,
$\bar{\sigma}(A) = \sigma_1(A) = \sigma_2(A) = \cdots$. A maximum left/right singular vector pair (or major
output/input principal direction pair) $\{x(A), y(A)\}$ is any pair of left and right singular
vectors $x_i(A)$ and $y_i(A)$ that correspond to the maximum singular value and satisfy (2.2).
Necessarily, a major output principal direction and a major input principal direction
must be elements of the eigenspaces of $AA^*$ and $A^*A$, respectively,
associated with the maximum singular value.
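
The conventions above map directly onto a numerical SVD routine. The sketch below is illustrative only: it assumes NumPy, whose `numpy.linalg.svd` returns the factors in the order X, singular values, Y*, and it uses an arbitrary random test matrix rather than any matrix from this chapter.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# numpy returns a = x @ diag(s) @ yh with s sorted in descending order,
# so x plays the role of X(A) and yh plays the role of Y*(A) in (2.1).
x, s, yh = np.linalg.svd(a)
y = yh.conj().T

reconstruction_ok = np.allclose(a, x @ np.diag(s) @ yh)

# Relationship (2.2): A y_i = sigma_i x_i for every i.
pairs_ok = all(np.allclose(a @ y[:, i], s[i] * x[:, i]) for i in range(n))

# Singular values as square roots of the eigenvalues of A* A.
eig_aha = np.sort(np.linalg.eigvalsh(a.conj().T @ a))[::-1]
sqrt_ok = np.allclose(s, np.sqrt(eig_aha))

print(reconstruction_ok, pairs_ok, sqrt_ok)
```

When the maximum singular value is repeated, the leading columns of `x` and `y` returned by the routine are just one admissible choice of basis for the corresponding eigenspaces, which is the source of the non-uniqueness discussed later in the chapter.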

In this chapter the following definitions are used in relation to the eigenvalue
decomposition (Golub & Van Loan, 1983; Isaacson & Keller, 1966; Stewart, 1970). Let
$\lambda_i(A)$ be an eigenvalue of $A$; then a right eigenvector $v_i$ of $A$ is any non-zero vector
that satisfies

$$A v_i = \lambda_i(A)\, v_i$$

Furthermore, a left eigenvector $w_i$ of $A$ is any non-zero vector that satisfies

$$w_i^* A = \lambda_i(A)\, w_i^*$$

The reader is cautioned that some authors use the term left eigenvector for an eigenvector
of $A^T$. Finally, an eigenvalue of maximum modulus is any eigenvalue $\lambda_i(A)$ that
satisfies $|\lambda_i(A)| = \rho(A)$, where $\rho(A)$ denotes the spectral radius of $A$.
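
As a hedged illustration of the left-eigenvector convention adopted here, the sketch below obtains left eigenvectors as eigenvectors of the conjugate transpose. The 2-by-2 matrix is an arbitrary example chosen for this sketch, and the index-matching step is an assumption of the sketch rather than a method taken from the text.

```python
import numpy as np

a = np.array([[2.0, 1.0], [0.0, 1.0j]], dtype=complex)

# Right eigenvectors: A v = lambda v.
lam, v = np.linalg.eig(a)

# Left eigenvectors in the convention used here: w* A = lambda w*,
# equivalently A* w = conj(lambda) w, so they are eigenvectors of A*.
mu, w = np.linalg.eig(a.conj().T)

i = int(np.argmax(np.abs(lam)))                  # eigenvalue of maximum modulus
k = int(np.argmin(np.abs(mu.conj() - lam[i])))   # matching left eigenvector

right_ok = np.allclose(a @ v[:, i], lam[i] * v[:, i])
left_ok = np.allclose(w[:, k].conj() @ a, lam[i] * w[:, k].conj())
rho = float(np.max(np.abs(lam)))                 # spectral radius

print(right_ok, left_ok, rho)
```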

2.2.2.  Dual  Norms  and  Dual  Vectors 

In the theoretical development that follows, the mathematical concepts of dual
norms and dual vectors are utilized. These concepts are explained in a paper by Bauer
(1962) and are reviewed here to facilitate the theoretical development. Given a vector
norm $\|\cdot\|$, its dual vector norm $\|\cdot\|_D$ is defined as

$$\|y\|_D := \max_{\|x\| = 1} \operatorname{Re}\, y^*x = \max_{x \ne 0} \frac{\operatorname{Re}\, y^*x}{\|x\|}$$

For such dual norms the Hölder inequality

$$\|y\|_D\,\|x\| \ge \operatorname{Re}\, y^*x$$

holds and is sharp, i.e., for any $y_0$ there exists at least one $x_0$, and for any $x_0$ there exists
at least one $y_0$, such that equality holds (Bauer, 1962). If such a pair $(x_0, y_0)$ with
$\|y_0\|_D\,\|x_0\| = \operatorname{Re}\, y_0^*x_0$ also satisfies the scaling condition

$$\|y_0\|_D\,\|x_0\| = 1$$

it is called a dual pair. Note that the dual vector of $x$ is often written $(x)_D$. A pair
$(x_0, y_0)$ is strictly dual, written $y_0 \parallel_D x_0$, if $\|y_0\|_D\,\|x_0\| = y_0^*x_0 = 1$. For strictly
homogeneous norms (i.e., those satisfying $\|\alpha x\| = |\alpha|\,\|x\|$ for all complex scalars $\alpha$) the
Hölder inequality may be sharpened to (Bauer, 1962)

$$\|y\|_D\,\|x\| \ge |y^*x|$$

For a dual pair $(x_0, y_0)$ under a strictly homogeneous norm it follows that
$\operatorname{Re}\, y_0^*x_0 = \|y_0\|_D\,\|x_0\| \ge |y_0^*x_0|$, which implies that $\operatorname{Re}\, y_0^*x_0 = y_0^*x_0$. Hence, for a strictly
homogeneous norm every dual pair $(x_0, y_0)$ is also a strictly dual pair. In
addition, there exists a strict dual $y_0$ for any $x_0 \ne 0$ and a strict dual $x_0$ for any $y_0 \ne 0$.
Furthermore, the concept of approximately dual vectors is proposed such that a pair
$(x_0, y_0)$ is approximately dual if $\|y_0\|_D\,\|x_0\| = |y_0^*x_0| = 1$.

In general, the dual norm of a p-norm $\|x\|_p := (\sum_i |x_i|^p)^{1/p}$ is the associated q-norm
$\|x\|_q$, where $1/p + 1/q = 1$. So the infinity-norm and the 1-norm are duals, and the dual
norm of the 2 (Euclidean) norm is itself. For the 2-norm, a pair $(x_0, y_0)$ is dual if
$y_0 = x_0 / \|x_0\|_2^2$, and approximately dual if $y_0 = e^{j\theta} x_0 / \|x_0\|_2^2$.
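
The duality relations for the Euclidean norm, and the duality between the infinity-norm and the 1-norm, can be illustrated numerically. This is a minimal sketch under stated assumptions: the vectors are random, the phase `theta` is arbitrary, and the scaling by the squared norm is chosen so that the normalization used for strictly dual pairs (norm product and inner product both equal to one) holds.

```python
import numpy as np

rng = np.random.default_rng(2)
x0 = rng.standard_normal(4) + 1j * rng.standard_normal(4)

# Euclidean norm: y0 = x0 / ||x0||^2 gives ||y0|| ||x0|| = y0* x0 = 1.
y0 = x0 / np.linalg.norm(x0) ** 2
norm_product = np.linalg.norm(y0) * np.linalg.norm(x0)
inner = np.vdot(y0, x0)            # vdot conjugates its first argument: y0* x0

# Rotating y0 by a phase keeps |y0* x0| = 1 but moves y0* x0 off the real
# axis, which is the "approximately dual" situation.
theta = 0.7
y_rot = np.exp(1j * theta) * y0
inner_rot = np.vdot(y_rot, x0)

# Infinity-norm / 1-norm duality: the maximum of Re y* x over ||x||_inf = 1
# equals ||y||_1 and is attained by giving each x_i the phase of y_i.
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)
x_star = y / np.abs(y)
attained = float(np.real(np.vdot(y, x_star)))
one_norm = float(np.linalg.norm(y, 1))

print(norm_product, inner, abs(inner_rot), attained - one_norm)
```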

2.2.3.  Dual  Eigenvector  Result 

The basis of the following lemma is a result of Bauer (1962) on the field of
values of a matrix.

Lemma 2.1. If the spectral radius of a matrix $A \in \mathbb{C}^{n \times n}$ is equal to the maximum
singular value of $A$, then for each normalized right eigenvector $v_i$ associated
with an eigenvalue of maximum modulus $\lambda_i(A)$ there exists a normalized left
eigenvector $w_i = v_i$ such that $v_i$ and $w_i$ form a dual pair $w_i \parallel_D v_i$.

Proof. Lemma 2.1 is a specialization of Bauer's result to the case of the
Euclidean norm, and is therefore stated in terms of the maximum singular value of the matrix.
The proof makes use of the dual-norm and dual-vector theory presented earlier. Details are in
Appendix A. Q.E.D.

2.2.4.  Eigenvector-Singular  Vector  Equivalence  Result 

The  following  Lemma  is  a  consequence  of  Lemma  2.1. 

Lemma 2.2. If the spectral radius of a matrix $A \in \mathbb{C}^{n \times n}$ is equal to the maximum
singular value of $A$, then each normalized right eigenvector $v_i$ of $A$ associated
with an eigenvalue of maximum modulus $\lambda_i(A)$ is also a right singular vector $y_i$
of $A$ associated with the maximum singular value $\bar{\sigma}(A)$.

Proof. It suffices to prove that $v_i$ is an eigenvector of $A^*A$ associated with
an eigenvalue whose square root is $\bar{\sigma}(A)$, because by definition the right singular
vectors $y_i$ are an orthonormal eigenbasis of $A^*A$ and the singular values are the square
roots of the eigenvalues of $A^*A$. First, from Lemma 2.1, it follows that for each
normalized right eigenvector $v_i$ of $A$ associated with an eigenvalue of maximum
modulus $\lambda_i(A)$ there exists a normalized left eigenvector $w_i = v_i$ of $A$. For each such
eigenvector $v_i$

$$\begin{aligned}
A^*A\,v_i &= A^* v_i\,\lambda_i(A) \\
&= (v_i^* A)^*\,\lambda_i(A) \\
&= (w_i^* A)^*\,\lambda_i(A) \\
&= (\lambda_i(A)\,w_i^*)^*\,\lambda_i(A) \\
&= v_i\,\lambda_i^*(A)\,\lambda_i(A) \\
&= |\lambda_i(A)|^2\,v_i \\
&= \lambda_i(A^*A)\,v_i
\end{aligned} \tag{2.3}$$

Hence, from (2.3) it follows that $v_i$ is an eigenvector of $A^*A$ with eigenvalue $\lambda_i(A^*A)$.
Finally, $\sqrt{\lambda_i(A^*A)} = \sqrt{|\lambda_i(A)|^2} = \sqrt{\bar{\sigma}^2(A)} = \bar{\sigma}(A)$, completing the proof. Q.E.D.

2.3.  Statement  of  the  Major  Principal  Direction  Alignment  Property 

In solving various robust control problems it is necessary to determine the
conditions under which the spectral radius of a matrix attains its maximum-singular-value
upper bound. The major principal direction alignment (MPDA) property addresses this
problem. Consider the singular value decomposition of a matrix $A$ given by (2.1), where
$\Sigma(A)$ is the diagonal matrix of singular values organized in descending order, and $X(A)$
and $Y(A)$ are unitary matrices whose columns are the respective output and input
principal directions of $A$, arranged in an order conformal with the order of the singular
values. The major input principal direction $y(A)$ and major output principal direction
$x(A)$ of a matrix $A$ are defined as input and output principal directions, respectively,
corresponding to the maximum singular value $\bar{\sigma}(A)$ of $A$. In addition, the major input
principal direction $y(A)$ and the major output principal direction $x(A)$ are said to be
aligned if there exists a real scalar $\theta \in \mathbb{R}$ such that $y(A) = e^{j\theta} x(A)$. The following
statement of the Major Principal Direction Alignment (MPDA) property is found in
Kouvaritakis and Latchman (1985).

Theorem 2.1. The spectral radius of any matrix $A \in \mathbb{C}^{n \times n}$ is equal to the
maximum singular value of $A$ if and only if the major input and output principal
directions of $A$ are aligned.

Proof. The proof consists of two cases, namely, when the maximum singular
value  is  distinct,  and  when  it  is  repeated.  The  proof  is  taken  directly  from  Kouvaritakis 
and  Latchman  (1985)  and  is  relegated  to  Appendix  B.  Q.E.D. 

For  the  case  of  a  distinct  maximum  singular  value,  Theorem  2.1  as  stated  is 
entirely  accurate  and  the  proof  rigorous.  Unfortunately,  when  there  is  a  repeated 
maximum  singular  value,  Theorem  2.1  as  stated  is  not  entirely  accurate  and  the  proof  is 
not rigorous. In the proof, Equation B.5 states that the variable $z$ must assume a given
form (i.e., that $z = Y^*(A)w$ must be at least one element of the form). This does not
mean that every major input and output principal direction pair results from
$z = Y^*(A)w$; instead it should be interpreted as meaning that there is at least one such
pair that results from $z = Y^*(A)w$. Hence, when the maximum singular value is
repeated,  there  may  exist  a  major  input  and  output  principal  direction  pair  that  is  not 
aligned  even  when  the  spectral  radius  equals  the  maximum  singular  value. 
Counterexamples  are  given  in  the  examples  section.  A  modified  statement  of  MPDA 
with  a  proof  based  on  duality  arguments  is  provided  in  the  next  section. 

2.4.  Modified  Statement  of  the  Major  Principal  Direction  Alignment  Principle 
The following theorem is a modification of the MPDA Theorem 2.1 that
accurately takes into account the case of a repeated maximum singular value.

Theorem 2.2. The spectral radius of any matrix $A \in \mathbb{C}^{n \times n}$ is equal to the
maximum singular value of $A$ if and only if there exists a major input and major
output principal direction pair of $A$ that is aligned.

Proof. To prove sufficiency, note that alignment of a major input and major
output principal direction pair of $A$ implies

$$y(A) = e^{j\theta}\,x(A) \tag{2.4}$$

Pre-multiplication of equation (2.4) by $A$ gives

$$A\,y(A) = e^{j\theta} A\,x(A) \tag{2.5}$$

The singular value decomposition of $A$ implies

$$A\,y(A) = \bar{\sigma}(A)\,x(A) \tag{2.6}$$

Combining equation (2.5) and equation (2.6) gives

$$A\,x(A) = e^{-j\theta}\,\bar{\sigma}(A)\,x(A)$$

so that $\lambda = e^{-j\theta}\bar{\sigma}(A)$ emerges as an eigenvalue of $A$ with eigenvector $x(A)$. Noting
that the moduli of the eigenvalues of $A$ are always bounded from above by $\bar{\sigma}(A)$, it follows that

$$|\lambda| = \rho(A) = \bar{\sigma}(A)$$

To prove necessity, assume $\rho(A) = \bar{\sigma}(A)$; then from Lemma 2.2 it follows that any right
eigenvector $v_i$ of $A$ associated with an eigenvalue of maximum modulus $\lambda_i(A)$ is also
a right singular vector $y_i$ of $A$ associated with the maximum singular value $\bar{\sigma}(A)$.
From equation (2.2), the corresponding left singular vectors are

$$\begin{aligned}
x_i(A) &= \frac{A\,y_i(A)}{\bar{\sigma}(A)} \\
&= \frac{A\,v_i(A)}{|\lambda_i(A)|} \\
&= \frac{\lambda_i(A)\,v_i(A)}{|\lambda_i(A)|} \\
&= e^{j\arg(\lambda_i(A))}\,y_i(A) \\
&= e^{j\theta}\,y_i(A)
\end{aligned}$$

Therefore, for each normalized right eigenvector $v_i$ there is a major input/output
principal direction pair that is aligned, namely

$$y_i(A) = v_i$$

and

$$x_i(A) = e^{j\theta}\,y_i(A) \tag{2.7}$$

where

$$\theta = \arg(\lambda_i(A)) \tag{2.8}$$

Finally, there is always at least one right eigenvector $v_i$ of $A$ associated with an
eigenvalue of maximum modulus $\lambda_i(A)$; therefore, there must exist at least one major
input/output principal direction pair that is aligned, which completes the proof. Q.E.D.
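
The necessity argument is constructive, and can be sketched in code: take a normalized eigenvector of maximum modulus as the major input direction and build the output direction from (2.2). The function name `aligned_major_pair`, the tolerance `tol`, and both test matrices are assumptions made for this illustration; they do not come from the original work.

```python
import numpy as np

def aligned_major_pair(a, tol=1e-9):
    """Return (x, y, theta) with A y = sigma_max x and x = exp(j*theta) y,
    or None when the spectral radius is below the maximum singular value."""
    lam, v = np.linalg.eig(a)
    i = int(np.argmax(np.abs(lam)))
    sigma_max = np.linalg.svd(a, compute_uv=False)[0]
    if abs(lam[i]) < sigma_max * (1.0 - tol):
        return None                        # rho(A) < sigma_max: no aligned pair
    y = v[:, i] / np.linalg.norm(v[:, i])  # major input direction, y = v
    x = a @ y / sigma_max                  # major output direction from (2.2)
    theta = float(np.angle(lam[i]))        # equation (2.8)
    return x, y, theta

# A normal matrix (unitary similarity of a diagonal matrix) has rho = sigma_max.
q, _ = np.linalg.qr(np.array([[1.0, 2.0], [3.0, 4.0j]]))
normal = q @ np.diag([2.0j, 1.0]) @ q.conj().T
x, y, theta = aligned_major_pair(normal)
aligned = np.allclose(x, np.exp(1j * theta) * y)

# A nilpotent matrix has rho = 0 < sigma_max = 1, so no pair can be aligned.
no_pair = aligned_major_pair(np.array([[0.0, 1.0], [0.0, 0.0]]))

print(aligned, theta, no_pair)
```

The design choice here is to start from the eigenvector rather than from the singular vectors returned by an SVD routine, because with a repeated maximum singular value the routine's basis for the major subspace need not contain an aligned pair.
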
Theorem  2.2  is  a  precise  statement  of  the  MPDA  property.  The  theorem 
eliminates  any  ambiguity  that  may  result  when  applying  the  MPDA  property  as  stated  in 
Theorem  2.1  to  the  case  of  repeated  maximum  singular  values.  In  addition,  the  proof  of 
necessity  makes  well-designed  use  of  the  earlier  work  on  dual  vectors  and  dual  norms, 
and  avoids  the  confusions  associated  with  the  earlier  proof.  This  section  is  concluded 
with  a  simple  corollary  that  restates  the  MPDA  property  in  the  duality  terminology, 
namely 

Corollary 2.1. The spectral radius of any matrix $A \in \mathbb{C}^{n \times n}$ is equal to the
maximum  singular  value  of  A  if  and  only  if  there  exists  a  major  input  and  major 
output  principal  direction  pair  of  A  that  is  approximately  dual  with  respect  to  the 
Euclidean  norm. 

Proof. It suffices to show that approximate duality of a major input/output
principal direction pair with respect to the Euclidean norm is equivalent to alignment of
the pair. By definition, the pair is approximately dual with respect to the Euclidean
norm if and only if

\[ y^* x = e^{j\theta} \|y\|_2 \|x\|_2 \tag{2.9} \]

Principal directions are always normalized; therefore (2.9) is equivalent to

\[ x = e^{j\theta} y \]

which is exactly the condition for alignment, completing the proof. Q.E.D.

2.5. Examples
2.5.1. Example 2.1

Consider the matrix


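\[
A = \begin{bmatrix}
-0.9026 - 1.0077i & 0.2586 - 0.1506i & 0.1661 + 0.2372i \\
0.6086 + 0.2053i & 1.2588 - 1.1670i & -0.6442 + 0.2239i \\
0.6487 + 0.2968i & -0.5918 - 0.4665i & 0.1641 - 1.4383i
\end{bmatrix}
\]

with eigenvectors

\[
\{v_1, v_2, v_3\} = \left\{
\begin{bmatrix} -0.0687 + 0.1159i \\ -0.8719 + 0.2183i \\ 0.3920 + 0.1421i \end{bmatrix},
\begin{bmatrix} 0.1807 + 0.1816i \\ 0.3478 - 0.2425i \\ 0.8670 + 0.0538i \end{bmatrix},
\begin{bmatrix} 0.6834 - 0.0523i \\ -0.3628 - 0.1105i \\ -0.4576 - 0.4208i \end{bmatrix}
\right\}
\]

and associated eigenvalues

\[ \{\lambda_1, \lambda_2, \lambda_3\} = \{2.0000e^{-0.6000j},\ 1.2503e^{-1.6660j},\ 1.5996e^{-2.2558j}\} \]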

and singular value decomposition $A = X \Sigma Y^*$, where

\[
X = [x_1 \ x_2 \ x_3] = \begin{bmatrix}
-0.0018 + 0.0876i & 0.4121 + 0.3962i & 0.4426 + 0.6853i \\
0.2741 + 0.8117i & -0.4542 + 0.0343i & 0.2421 - 0.0061i \\
0.2903 - 0.4173i & -0.4806 - 0.4845i & 0.2491 + 0.4624i
\end{bmatrix}
\]

\[
Y = [y_1 \ y_2 \ y_3] = \begin{bmatrix}
0.1556 & -0.7481 & -0.6451 \\
-0.2965 + 0.8731i & -0.0272 - 0.1299i & -0.0400 + 0.3612i \\
0.3367 - 0.1101i & 0.5404 - 0.3615i & -0.5455 + 0.3927i
\end{bmatrix}
\tag{2.10}
\]

and the singular values are

\[ \{\sigma_1, \sigma_2, \sigma_3\} = \{2, 2, 1\} \]

The spectral radius equals the maximum singular value, i.e.

\[ |\lambda_1| = \rho(A) = \bar{\sigma}(A) = \sigma_1 = \sigma_2 \]

In this case the eigenvalue of maximum modulus is unique and non-repeated, and the
maximum singular value is repeated. An inspection of the left and right singular vectors
reveals that $x_1 \neq e^{j\theta} y_1$ and $x_2 \neq e^{j\theta} y_2$, which appears to
contradict the MPDA Theorem 2.2, which states that there must exist at least one major
input/output principal direction pair that is aligned. This apparent contradiction is resolved
by recognizing that (2.10) is only one possible orthonormal eigenbasis of $A^*A$ whose vectors
are right singular vectors. Different orthonormal eigenbases of $A^*A$ are obtained through
unitary transformations of the orthonormal bases of the eigenspaces of $A^*A$ associated with
each particular singular value. The eigenspace of $A^*A$ associated with a non-repeated
singular value has dimension one; therefore an orthonormal basis consists of only one vector,
and the only unitary transformation of this basis is of the form $e^{j\theta}$. On the other
hand, the eigenspace of $A^*A$ associated with a repeated singular value has dimension greater
than one; therefore an orthonormal basis consists of more than one vector, and a unitary
transformation of this basis is a unitary matrix whose size is the dimension of the
corresponding eigenspace.

Hence, for this example, there must exist a unitary matrix that transforms the left
singular vectors $x_1$ and $x_2$ into $x'_1$ and $x'_2$ such that at least one of the
transformed left singular vectors is aligned with the corresponding transformed right
singular vector $y'_1$ or $y'_2$. The problem becomes finding a matrix $U$ such that

\[ [x'_1 \ x'_2] = [x_1 \ x_2] U \tag{2.11} \]

\[ [y'_1 \ y'_2] = [y_1 \ y_2] U \tag{2.12} \]

and

\[ x'_1 = e^{j\theta} y'_1 \]

with unitary constraint

\[ U^* U = I \]

The solution can be found by solving the system of equations that equates the moduli of
the elements of $x'_1$ and $y'_1$ and that constrains the arguments of the elements of
$x'_1$ and $y'_1$ to differ by $\theta$, where the unknowns are the elements of $U$ and the
variable $\theta$. Although this is a simple problem in complex algebra, the resulting set of
equations has many terms and is relatively cumbersome (further theoretical work in this area
is discussed in Chapter 3). Therefore, an alternative method is used to solve the problem.
First, from Lemma 2.2 it follows that the right eigenvector $v_1$ is also a right singular
vector $y'_1$; therefore, if $U = [u_1 \ u_2]$, then the first part of the problem becomes
finding a normalized $u_1$ such that

\[ y'_1 = v_1 = [y_1 \ y_2] u_1 \tag{2.13} \]

The normalized least squares solution to (2.13) is

\[
u_1 = \frac{[y_1 \ y_2]^{+} v_1}{\left\| [y_1 \ y_2]^{+} v_1 \right\|}
    = \begin{bmatrix} 0.5548 + 0.8057i \\ 0.2072 + 0.0126i \end{bmatrix}
\]

where $[\cdot]^{+}$ denotes the Moore-Penrose pseudo-inverse (Ortega, 1987). The second and
last part of the problem is to choose $u_2$ such that $U$ is unitary. One choice is

\[
U = [u_1 \ u_2] = \begin{bmatrix}
0.5548 + 0.8057i & 0.2075 \\
0.2072 + 0.0126i & -0.6026 + 0.7705i
\end{bmatrix}
\]

Now using the relationships (2.11) and (2.12) and defining $X' = [x'_1 \ x'_2 \ x_3]$ and
$Y' = [y'_1 \ y'_2 \ y_3]$ yields an alternative singular value decomposition
$A = X' \Sigma Y'^*$, where

\[
X' = [x'_1 \ x'_2 \ x_3] = \begin{bmatrix}
0.0088 + 0.1345i & -0.5540 + 0.0969i & 0.4426 + 0.6853i \\
-0.5964 + 0.6725i & 0.3042 - 0.2021i & 0.2421 - 0.0061i \\
0.4038 - 0.1041i & 0.7232 - 0.1649i & 0.2491 + 0.4624i
\end{bmatrix}
\]

\[
Y' = [y'_1 \ y'_2 \ y_3] = \begin{bmatrix}
-0.0687 + 0.1159i & 0.4831 - 0.5764i & -0.6451 \\
-0.8719 + 0.2183i & 0.0549 + 0.2386i & -0.0400 + 0.3612i \\
0.3920 + 0.1421i & 0.0228 + 0.6114i & -0.5455 + 0.3927i
\end{bmatrix}
\]

and the singular values again are

\[ \{\sigma_1, \sigma_2, \sigma_3\} = \{2, 2, 1\} \]

Finally, the apparent contradiction of Theorem 2.2 is resolved by verifying

\[ x'_1 = e^{j\arg(\lambda_1)} y'_1 = e^{-0.6000j} y'_1 \]

Note that $x'_2 \neq e^{j\theta} y'_2$ even though $\sigma_2$ is equal to the maximum singular
value. A reasonable question now is whether it is possible to choose $u_2$ such that $x'_2$ is
also aligned with $y'_2$. The answer is no. This is proved as follows. By construction $y'_1$
is an eigenvector of $A$ corresponding to an eigenvalue of maximum modulus, namely $v_1$.
Next, it can be shown (see Theorem 2.1, proof of sufficiency) that alignment of major
input and major output principal directions implies that the major principal directions are
both necessarily eigenvectors of $A$. Now, assume that it is possible to choose $u_2$ such that
$x'_2$ and $y'_2$ are aligned. Necessarily, $y'_2$ is also an eigenvector of $A$, which implies
that there are two linearly independent eigenvectors, $y'_1$ and $y'_2$, associated with the
eigenvalue of maximum modulus. Hence, the eigenvalue's geometric multiplicity is
greater than one. However, for this case the eigenvalue's algebraic multiplicity is one,
and an eigenvalue's geometric multiplicity cannot exceed its algebraic multiplicity. This
is a contradiction; therefore the assumption is false.

The result that it is not possible to achieve alignment of all the major input and
major output principal directions is not compatible with the original statement of the
MPDA property as given in Section 2.3. However, it is compatible with the revised
version of the MPDA property of Section 2.4, which allows for a major input/output
principal direction pair that is not aligned as long as at least one other input/output
principal direction pair is aligned, as is the case in this example.
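
The construction used in Example 2.1 can be sketched numerically. The following is a minimal NumPy sketch, not the procedure used to produce the numbers above: the helper name `aligned_major_pair` and the test matrix (built from a random unitary matrix so that the singular values are {2, 2, 1} and the eigenvalue of maximum modulus is $2e^{-0.6j}$) are assumptions made for illustration only.

```python
import numpy as np

def aligned_major_pair(A, r):
    """Return (x1p, y1p, theta) with x1p = exp(j*theta) * y1p, assuming the
    maximum singular value of A has multiplicity r and equals rho(A)."""
    X, s, Yh = np.linalg.svd(A)            # A = X @ diag(s) @ Yh
    Y = Yh.conj().T
    lam, V = np.linalg.eig(A)
    k = np.argmax(np.abs(lam))             # eigenvalue of maximum modulus
    v1 = V[:, k] / np.linalg.norm(V[:, k])
    u1 = np.linalg.pinv(Y[:, :r]) @ v1     # least-squares solution of (2.13)
    u1 = u1 / np.linalg.norm(u1)           # normalize u1
    y1p = Y[:, :r] @ u1                    # transformed right singular vector
    x1p = X[:, :r] @ u1                    # transformed left singular vector
    return x1p, y1p, np.angle(lam[k])

# Hypothetical test matrix with rho(A) = sigma_max(A) = 2 and singular values {2, 2, 1}
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
q1, q2, q3 = Q.T                            # columns of the unitary matrix Q
A = (2 * np.exp(-0.6j) * np.outer(q1, q1.conj())
     + 2 * np.outer(q3, q2.conj())
     + 1 * np.outer(q2, q3.conj()))

x1p, y1p, theta = aligned_major_pair(A, r=2)
print(np.allclose(x1p, np.exp(1j * theta) * y1p))
```

Because the first two columns of `Y` are orthonormal, the pseudo-inverse reduces to a conjugate transpose here; `pinv` is kept only to mirror the least squares formulation in the text.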
2.5.2.  Example  2.2 

Consider  the  matrix 


\[
A = \begin{bmatrix}
1.7907 - 0.8729i & -0.0780 + 0.0482i & 0.0085 + 0.1511i \\
0.0827 - 0.0396i & 1.6645 - 1.1040i & 0.0475 - 0.0001i \\
0.1225 + 0.0888i & -0.0258 + 0.0399i & 1.6883 - 1.0605i
\end{bmatrix}
\]

with eigenvectors

\[
\{v_1, v_2, v_3\} = \left\{
\begin{bmatrix} 0.8554 - 0.0000i \\ 0.0145 - 0.2681i \\ 0.4002 - 0.1901i \end{bmatrix},
\begin{bmatrix} -0.0224 - 0.4144i \\ 0.0631 + 0.4205i \\ 0.5880 + 0.5489i \end{bmatrix},
\begin{bmatrix} 0.0187 + 0.1611i \\ -0.5177 + 0.7269i \\ 0.3707 - 0.1999i \end{bmatrix}
\right\}
\]

and associated eigenvalues

\[ \{\lambda_1, \lambda_2, \lambda_3\} = \{2.0000e^{-0.4000j},\ 2.0000e^{-0.6000j},\ 2.0000e^{-0.6000j}\} \]

and singular value decomposition $A = X \Sigma Y^*$, where


\[
X = [x_1 \ x_2 \ x_3] = \begin{bmatrix}
-0.0381 + 0.0008i & 0.8954 - 0.4364i & 0.0789 + 0.0124i \\
0.9550 - 0.2457i & 0.0414 - 0.0198i & -0.0952 - 0.1281i \\
-0.1616 - 0.0001i & 0.0613 + 0.0444i & -0.3854 - 0.9053i
\end{bmatrix}
\]

\[
Y = [y_1 \ y_2 \ y_3] = \begin{bmatrix}
0.0000 & 1.0000 & 0.0000 \\
0.9340 + 0.3268i & 0.0000 & -0.0243 - 0.1422i \\
-0.1138 - 0.0887i & 0.0000 & 0.1537 - 0.9775i
\end{bmatrix}
\]

and the singular values are

\[ \{\sigma_1, \sigma_2, \sigma_3\} = \{2, 2, 2\} \]
The spectral radius equals the maximum singular value, i.e.

\[ |\lambda_1| = |\lambda_2| = |\lambda_3| = \rho(A) = \bar{\sigma}(A) = \sigma_1 = \sigma_2 = \sigma_3 \]

where there are two distinct eigenvalues of maximum modulus, one being non-repeated and
the other having a multiplicity of two. The maximum singular value is a repeated singular
value of multiplicity three. Again, inspection of the left and right singular vectors reveals
that $x_1 \neq e^{j\theta} y_1$, $x_2 \neq e^{j\theta} y_2$, and $x_3 \neq e^{j\theta} y_3$.
From Theorem 2.2, it is known that there is at least one major input/output principal
direction pair that is aligned, but it is not apparent whether there is more than one. The
possibility exists that all three can be aligned through a unitary transformation, because
there are three linearly independent eigenvectors associated with the eigenvalues of maximum
modulus. The unitary matrix that transforms all three singular vectors such that all
input/output principal direction pairs are aligned is not found by solving the resulting
system of complex algebraic equations, because the equations are even more cumbersome than
those of the previous example. In fact, in this example, the existing singular value
decomposition is not transformed at all. Instead, an alternative singular value
decomposition is constructed from the three eigenvectors associated with the eigenvalues
of maximum modulus. First, one right singular vector is obtained from the eigenvector
associated with the eigenvalue that is not repeated, i.e., $y'_1 = v_1$ as dictated by
Lemma 2.2, and the corresponding left singular vector is given by

\[ x'_1 = e^{j\arg(\lambda_1)} y'_1 = e^{-0.4000j} y'_1 \]

according to (2.7) and (2.8) of Theorem 2.2. The remaining two right singular vectors
are obtained from $v_2$ and $v_3$, the eigenvectors associated with the repeated eigenvalue of
maximum modulus. It can be shown that both of these eigenvectors are eigenvectors of
$A^*A$, but that alone does not make them both right singular vectors, because singular
vectors are obtained from an orthonormal eigenbasis of $A^*A$. It is easy to show that $v_1$
is normalized and orthogonal to $v_2$ and $v_3$. Therefore, the remaining step is to
orthonormalize $v_2$ and $v_3$. One such orthonormalization is

\[ y'_2 = v_2 \]

\[ y'_3 = \frac{v_3 - (v_2^* v_3) v_2}{\left\| v_3 - (v_2^* v_3) v_2 \right\|} \]

The corresponding left singular vectors are then given by

\[ x'_2 = e^{j\arg(\lambda_2)} y'_2 = e^{-0.6000j} y'_2 \]

\[ x'_3 = e^{j\arg(\lambda_3)} y'_3 = e^{-0.6000j} y'_3 \]


The alternative singular value decomposition is $A = X' \Sigma Y'^*$, where

\[
X' = [x'_1 \ x'_2 \ x'_3] = \begin{bmatrix}
0.7878 - 0.3331i & -0.2525 - 0.3294i & 0.2144 + 0.2240i \\
-0.0910 - 0.2525i & 0.2895 + 0.3114i & -0.1311 + 0.8544i \\
0.2946 - 0.3310i & 0.7952 + 0.1210i & -0.0666 - 0.3902i
\end{bmatrix}
\]

\[
Y' = [y'_1 \ y'_2 \ y'_3] = \begin{bmatrix}
0.8554 - 0.0000i & -0.0224 - 0.4144i & 0.0505 + 0.3059i \\
0.0145 - 0.2681i & 0.0631 + 0.4205i & -0.5907 + 0.6311i \\
0.4002 - 0.1901i & 0.5880 + 0.5489i & 0.1654 - 0.3597i
\end{bmatrix}
\]

and the singular values are still

\[ \{\sigma_1, \sigma_2, \sigma_3\} = \{2, 2, 2\} \]

By construction, the input/output principal direction pairs are aligned as follows

\[ x'_1 = e^{-0.4000j} y'_1 \]

\[ x'_2 = e^{-0.6000j} y'_2 \]

\[ x'_3 = e^{-0.6000j} y'_3 \]

This result shows that it is possible for several pairs of input/output principal directions to
be aligned when there is singular value multiplicity. Note that the alignment factors,
however, are not necessarily identical.
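
The situation of Example 2.2 can be reproduced with a short NumPy sketch. The matrix below is not the matrix of the example; it is a hypothetical normal matrix built from a random unitary eigenvector matrix `V` and the same eigenvalue phases, chosen only to show that the eigenvectors directly supply an aligned singular value decomposition with differing alignment factors.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical unitary eigenvector matrix; its columns play the role of y'_1, y'_2, y'_3
V, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
phases = np.exp(1j * np.array([-0.4, -0.6, -0.6]))

A = 2 * (V * phases) @ V.conj().T      # A = V diag(2 e^{j theta_k}) V*

Yp = V                                  # right singular vectors taken as the eigenvectors
Xp = V * phases                         # x'_k = e^{j theta_k} y'_k (column-wise scaling)
Sigma = 2 * np.eye(3)

print(np.allclose(A, Xp @ Sigma @ Yp.conj().T))               # valid decomposition
print(np.allclose(np.linalg.svd(A, compute_uv=False), 2.0))   # all singular values equal 2
```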

2.6.  Conclusions 
This chapter clarifies the implications of the MPDA principle by explicitly
considering the case of a repeated maximum singular value. An alternative proof of the
necessity of the MPDA property is presented that is based on dual norm and vector
theory. This proof shows the ties the MPDA property has to the earlier duality work
that partly inspired it. Examples show that the alignment properties of the input/output
principal direction pairs associated with the maximum singular value are directly related to
the eigenvectors associated with eigenvalues of maximum modulus in terms of both the
multiplicity and the amount of alignment.


CHAPTER  3 

MAJOR  PRINCIPAL  DIRECTION  ALIGNMENT  WHEN 

THE MAXIMUM SINGULAR VALUE IS REPEATED AND

ITS  RELATIONSHIP  TO  OPTIMAL  SIMILARITY  SCALING 

3.1.  Introduction 

The Major Principal Direction Alignment (MPDA) theory yields a necessary and
sufficient condition for the spectral radius of a matrix to equal its maximum singular
value (Kouvaritakis and Latchman, 1985). This has been proved using duality arguments
in Chapter 2, where it is shown that the results hold even for the case of a repeated
maximum singular value. The primary reason for the development of the MPDA
principle is to solve the structured singular value ($\mu$) problem, which is often written
in the form (Doyle, 1982)

\[ \sup_{U \in \mathbb{U}} \rho(MU) = \mu(M) \le \inf_{D \in \mathbb{D}} \bar{\sigma}(DMD^{-1}) \tag{3.1} \]

where $M \in C^{n \times n}$,
$\mathbb{U} := \{\mathrm{diag}(e^{j\theta_1}, e^{j\theta_2}, \ldots, e^{j\theta_n}) \mid 0 \le \theta_i < 2\pi,\ i = 1, 2, \ldots, n\}$
is the set of diagonal unitary matrices and
$\mathbb{D} := \{\mathrm{diag}(d_1, d_2, \ldots, d_n) \mid d_i > 0,\ i = 1, 2, \ldots, n\}$ is the set of
positive diagonal matrices. In equation (3.1), $\rho$ represents the spectral radius, $\mu$ the
structured singular value, and $\bar{\sigma}$ the maximum singular value. The supremization over
$\mathbb{U}$ is known to be an NP-hard and non-convex optimization problem (Braatz, 1994);
therefore, when using standard optimization techniques there is always the problem of
local versus global optima. On the other hand, the infimization over $\mathbb{D}$ can be shown to
be a convex optimization problem (Safonov and Verma, 1985; Tzafestas, 1984;
Latchman, 1986) and the global minimum can be determined via an appropriate
optimization technique. However, as (3.1) implies, in general this yields only an upper
bound on $\mu$.
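
The two sides of (3.1) can be illustrated with a crude sampling sketch in NumPy. This is not an optimization algorithm: the helper names `rho` and `sigma_max`, the random matrix `M`, and the sample counts are assumptions chosen for illustration. It only shows that every sampled value of $\rho(MU)$ lies below every sampled value of $\bar{\sigma}(DMD^{-1})$.

```python
import numpy as np

def rho(A):
    """Spectral radius: largest eigenvalue modulus."""
    return np.max(np.abs(np.linalg.eigvals(A)))

def sigma_max(A):
    """Maximum singular value (matrix 2-norm)."""
    return np.linalg.norm(A, 2)

rng = np.random.default_rng(2)
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Lower bound on mu: best rho(M U) over sampled diagonal unitary matrices U
lower = max(
    rho(M @ np.diag(np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, n))))
    for _ in range(200)
)

# Upper bound on mu: best sigma_max(D M D^{-1}) over sampled positive diagonal D
upper = min(
    sigma_max(np.diag(d) @ M @ np.diag(1.0 / d))
    for d in np.exp(rng.standard_normal((200, n)))
)

print(lower <= upper)
```

The inequality holds for any samples because $\rho(MU) = \rho(DMD^{-1}U) \le \bar{\sigma}(DMD^{-1}U) = \bar{\sigma}(DMD^{-1})$ whenever $U$ is diagonal unitary and $D$ is positive diagonal.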

The MPDA theory shows that if the maximum singular value is distinct for a
given $D$, then there is an analytic expression for the gradient
$\partial\bar{\sigma}(DMD^{-1})/\partial D$. From this expression for the gradient, the
condition for a stationary point (i.e., $\partial\bar{\sigma}(DMD^{-1})/\partial D = 0$)
implies that the moduli of the input and output principal directions are elementwise equal.
Therefore, if at the infimum the maximum singular value is distinct, then the gradient exists
and is identically zero, and the moduli of the input and output principal directions are
pairwise equal. In addition, a unitary transformation matrix $U$ (note the maximum singular
value is invariant under unitary transformations) can be determined that shifts the angles of
the elements of the input or output principal direction such that MPDA is achieved, and
therefore the upper bound is tight and the value of $\mu$ is determined by solving a convex
optimization problem.

In general the maximum singular value need not be distinct for a given scaling $D$. This
work investigates further the situations that arise when the maximum singular value is
repeated. There are two aspects of this problem that are investigated. The first aspect is
the effect the repeated maximum singular value has on the optimization over $D$, with
specific interest in gradient search methods. The second aspect is the attainability of
MPDA when the maximum singular value is repeated for the optimal scaling. Finally,
this work attempts to reconcile the results obtained with those of the principal direction
alignment (PDA) principle (Daniel et al., 1986).

3.2. Mathematical Background
3.2.1.  The  Singular  Value  Decomposition 

The  following  definitions  are  associated  with  the  singular  value  decomposition 
(Ortega,  1987).  In  this  chapter  only  square  matrices  are  being  considered,  therefore  the 
definitions  are  specialized  for  the  case  of  square  matrices,  but  it  is  noted  that  the  singular 
value  decomposition  theory  is  applicable  to  rectangular  matrices. 

The singular value decomposition of an arbitrary matrix $A \in C^{n \times n}$ is given by

\[ A = X(A) \Sigma(A) Y^*(A) \tag{3.2} \]

where $\Sigma(A) := \mathrm{diag}(\sigma_1(A), \sigma_2(A), \ldots, \sigma_n(A))$ is the diagonal
matrix of singular values placed in descending order, and $X(A)$ and $Y(A)$ are unitary
matrices. The singular values of a square matrix $A \in C^{n \times n}$ are given by

\[ \sigma_i(A) := \sqrt{\lambda_i(A^*A)}, \quad i = 1, 2, \ldots, n \]

where $\lambda_i(A^*A)$ represents the $i$-th eigenvalue of the matrix $A^*A$ and where the
singular values are ordered such that

\[ \sigma_1(A) \ge \sigma_2(A) \ge \cdots \ge \sigma_n(A) \]

The matrices $X(A)$ and $Y(A)$ are of the form

\[ X(A) = [x_1(A) \ x_2(A) \ \cdots \ x_n(A)] \]

\[ Y(A) = [y_1(A) \ y_2(A) \ \cdots \ y_n(A)] \]

where the set of normalized left singular vectors (output principal directions) $\{x_i(A)\}$ and
the set of normalized right singular vectors (input principal directions) $\{y_i(A)\}$ for
$i = 1, 2, \ldots, n$ respectively constitute orthonormal eigenbases of $AA^*$ and $A^*A$, such that

\[ AA^* x_i(A) = \sigma_i^2(A) x_i(A) \]

and

\[ A^*A y_i(A) = \sigma_i^2(A) y_i(A) \tag{3.3} \]

Furthermore, a pair of singular vectors $\{x_i(A), y_i(A)\}$ is associated with each singular
value $\sigma_i(A)$ through the relationship

\[ A y_i(A) = \sigma_i(A) x_i(A) \tag{3.4} \]

The maximum singular value is defined as $\bar{\sigma}(A) := \sigma_1(A)$. The maximum
singular value can be a repeated singular value, i.e.
$\bar{\sigma}(A) = \sigma_1(A) = \sigma_2(A) = \cdots = \sigma_r(A)$, where $r \le n$ is the
multiplicity. A maximum left/right singular vector pair (or major output/input principal
direction pair) $\{\bar{x}(A), \bar{y}(A)\}$ is any pair of left and right singular vectors that
correspond to the maximum singular value and satisfy (3.4). Necessarily, a major output
principal direction and a major input principal direction must respectively be normalized
elements of the eigensubspaces of $AA^*$ and $A^*A$ associated with the maximum singular
value. If $\{x_i(A)\}$ and $\{y_i(A)\}$ for $i = 1, 2, \ldots, r$ are orthonormal bases for these
eigensubspaces that satisfy (3.4), then any and all major output principal directions and
major input principal directions are respectively given by

\[ \bar{x}(A) = [x_1(A) \ x_2(A) \ \cdots \ x_r(A)] u = \sum_{i=1}^{r} x_i(A) u_i \tag{3.5} \]

and

\[ \bar{y}(A) = [y_1(A) \ y_2(A) \ \cdots \ y_r(A)] u = \sum_{i=1}^{r} y_i(A) u_i \tag{3.6} \]

where $u \in C^r$ satisfies

\[ u^* u = 1 \]

that is, $u$ must lie on the unit sphere in $C^r$.

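The parametrization (3.5) and (3.6) can be checked numerically. The following NumPy sketch assumes a hypothetical test matrix `A` assembled from two random unitary matrices so that the maximum singular value 2 has multiplicity $r = 2$; any unit vector `u` then yields a valid major pair satisfying (3.4).

```python
import numpy as np

rng = np.random.default_rng(3)
Q1, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
Q2, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
A = Q1 @ np.diag([2.0, 2.0, 1.0]) @ Q2.conj().T   # sigma_max = 2 with multiplicity 2

X, s, Yh = np.linalg.svd(A)
Y = Yh.conj().T
r = 2

u = rng.standard_normal(r) + 1j * rng.standard_normal(r)
u = u / np.linalg.norm(u)            # enforce u* u = 1

x_bar = X[:, :r] @ u                 # major output principal direction, (3.5)
y_bar = Y[:, :r] @ u                 # major input principal direction, (3.6)

print(np.allclose(A @ y_bar, s[0] * x_bar))   # relation (3.4) for the major pair
```
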
3.2.2.  Statement  of  the  Major  Principal  Direction  Alignment  Principle 

The following theorem is a modification of the MPDA principle as proposed in
Kouvaritakis and Latchman (1985) which takes into account the case of a repeated
maximum singular value.

Theorem 3.1. The spectral radius of a matrix $A \in C^{n \times n}$ is equal to the maximum
singular value of $A$ if and only if there exists a major input and major output
principal direction pair of $A$ that is aligned, i.e. there exists a pair
$\{\bar{x}(A), \bar{y}(A)\}$ such that

\[ \bar{y}(A) = e^{j\theta} \bar{x}(A) \tag{3.7} \]

for some $\theta \in [0, 2\pi)$.

Proof. The proof is given in Kouvaritakis and Latchman (1985), and Chapter 2
offers an alternative proof based on the theory of dual vectors and dual norms. Q.E.D.

Given the optimal matrices $U^o$ and $D^o$, Theorem 3.1 gives a necessary and
sufficient condition for the left hand side (spectral radius) and right hand side (maximum
singular value) of (3.1) to hold with equality. It is apparent that equation (3.7) requires
that the major input and major output principal directions have element-by-element equal
moduli and a constant element-by-element phase difference.
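
A test for the alignment condition (3.7) can be sketched as follows. The function name `alignment_angle` and the tolerance are assumptions for illustration; the test relies on the Cauchy-Schwarz inequality, by which two unit vectors differ only by a phase factor exactly when the modulus of their inner product equals one.

```python
import numpy as np

def alignment_angle(x, y, tol=1e-10):
    """Return theta in [0, 2*pi) such that y = exp(j*theta) * x,
    or None if the normalized vectors are not aligned."""
    x = x / np.linalg.norm(x)
    y = y / np.linalg.norm(y)
    inner = np.vdot(x, y)                 # x* y (vdot conjugates its first argument)
    if abs(abs(inner) - 1.0) > tol:       # equal moduli and constant phase difference fail
        return None
    return np.angle(inner) % (2.0 * np.pi)

x = np.array([1.0, 1.0j]) / np.sqrt(2.0)
print(alignment_angle(x, np.exp(0.3j) * x))
print(alignment_angle(x, np.array([1.0, 1.0]) / np.sqrt(2.0)))
```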

3.2.3.  Affine  Sets,  Convex  Sets,  and  Convex  Functions 

If $x$ and $y$ are different points in $R^n$, the set of points of the form

\[ (1 - \lambda)x + \lambda y = x + \lambda(y - x), \quad \lambda \in R \]

is called the line through $x$ and $y$. A subset $M$ of $R^n$ is called an affine set if

\[ (1 - \lambda)x + \lambda y \in M \quad \forall x \in M,\ y \in M,\ \lambda \in R \]

In general, an affine set has to contain, along with any two different points, the entire line
through those points. The intuitive picture is that of an endless uncurved structure, like a
line or a plane in space. The subspaces of $R^n$ are the affine sets which contain the
origin. The dimension of a non-empty affine set is defined as the dimension of the
subspace parallel to it (the dimension of the empty set is -1 by convention). Affine sets
of dimension 0, 1 and 2 are called points, lines, and planes, respectively. An
$(n-1)$-dimensional affine set in $R^n$ is called a hyperplane. Hyperplanes and other affine
sets may be represented by linear functions and linear equations. For example, given
$\beta \in R$ and a non-zero $b \in R^n$, the set

\[ H = \{x \mid x^T b = \beta\} \tag{3.8} \]

is a hyperplane in $R^n$. Moreover, every hyperplane may be represented in this way, with
$\beta$ and $b$ unique up to a common non-zero multiple. For any non-zero $b \in R^n$ and any
$\beta \in R$, the sets

\[ \{x \mid x^T b \le \beta\}, \quad \{x \mid x^T b \ge \beta\} \]

are called closed half-spaces. The sets

\[ \{x \mid x^T b < \beta\}, \quad \{x \mid x^T b > \beta\} \]

are called open half-spaces. These half-spaces depend only on the hyperplane $H$ given
by (3.8). Therefore, one may speak unambiguously of the open and closed half-spaces
corresponding to a given hyperplane. Finally, the intersection of an arbitrary collection
of affine sets is again affine. Therefore, given any $S \subset R^n$ there exists a unique smallest
affine set containing $S$. This set is called the affine hull of $S$ and is denoted $\mathrm{aff}\,S$.
A subset $C$ of $R^n$ is said to be convex if

\[ (1 - \lambda)x + \lambda y \in C \quad \forall x \in C,\ y \in C,\ \lambda \in (0, 1) \]

All affine sets are convex, as are half-spaces. A vector sum

\[ \lambda_1 x_1 + \lambda_2 x_2 + \cdots + \lambda_m x_m \]

is called a convex combination of $x_1, x_2, \ldots, x_m$ if the coefficients $\lambda_i$ are all
non-negative and $\lambda_1 + \lambda_2 + \cdots + \lambda_m = 1$. A subset of $R^n$ is convex if and
only if it contains all the convex combinations of its elements. The intersection of all the
convex sets containing a given subset $S$ of $R^n$ is called the convex hull of $S$ and is denoted
$\mathrm{conv}\,S$. Necessarily, $\mathrm{conv}\,S$ is the smallest convex set containing $S$. In
addition, for any $S \subset R^n$, $\mathrm{conv}\,S$ consists of all the convex combinations of the
elements of $S$. In general, by the dimension of a convex set $C$ one means the dimension of
the affine hull of $C$.

A supporting half-space to a convex set $C$ is a closed half-space which contains
$C$ and has a point of $C$ in its boundary. A supporting hyperplane to $C$ is a hyperplane
which is the boundary of a supporting half-space to $C$. As such, a supporting hyperplane
to $C$ is associated with a linear function which achieves its maximum on $C$. The
supporting hyperplanes passing through a given point $a \in C$ correspond to vectors $b$
normal to $C$ at $a$, as defined by (3.8).

Let a function $f(d)$ be defined on a convex set $S \subset R^n$ (note, for the MPDA
problem $f(d) := \bar{\sigma}(DMD^{-1})$, where $D = \mathrm{diag}(d)$ and $S$ is the positive
orthant such that $D \in \mathbb{D}$). In what follows, it is assumed that the function $f(d)$
is finite on its domain of definition. The function $f(d)$ is said to be convex on $S$ if

\[ f(\alpha d_1 + (1 - \alpha) d_2) \le \alpha f(d_1) + (1 - \alpha) f(d_2) \quad \forall d_1, d_2 \in S,\ \alpha \in [0, 1] \]

A concave function on $S$ is a function whose negative is convex. An affine function on
$S$ is a function which is convex and concave. The set $\{(d, f(d)) \in R^{n+1} \mid d \in S\}$ is the
graph of the function $f(d)$ defined on the set $S$. The set

\[ \mathrm{epi}\,f := \{(d, \beta) \in R^{n+1} \mid d \in S,\ \beta \in R,\ \beta \ge f(d)\} \]

is called the epigraph of the function $f(d)$ defined on the set $S$. The epigraph of a
convex function is a convex set.
3.2.4.  Differential  Theory 

A vector $\xi$ is said to be a subgradient of $f: S \subset R^n \to R$ at $d \in S$ if

\[ f(g) \ge f(d) + \xi^T (g - d), \quad \forall g \in S \tag{3.9} \]

This condition, which is referred to as the subgradient inequality, has a simple geometric
meaning: it says that the graph of the affine function $h(g) = f(d) + \xi^T (g - d)$ is a
non-vertical supporting hyperplane to the convex set $\mathrm{epi}\,f$ at the point $(d, f(d))$.
The set of all subgradients of $f$ at $d$ is called the subdifferential of $f$ at $d$ and is denoted
by $\partial f(d)$. The multivalued point-to-set mapping $\partial f: d \to \partial f(d)$ is called
the subdifferential of $f$. Obviously, $\partial f(d)$ is a closed convex set, since by definition
$\xi \in \partial f(d)$ if and only if $\xi$ satisfies a certain infinite system of weak linear
inequalities (one for each $g$ in (3.9)). In addition, $\partial f(d)$ is also nonempty and bounded.

The directional derivative of $f$ at $d \in S$ in the direction of $g$, denoted $f'(d; g)$,
is defined by the limit

\[ f'(d; g) = \lim_{\lambda \to 0^+} \frac{f(d + \lambda g) - f(d)}{\lambda} \]

if it exists. Notably, for convex functions the directional derivative $f'(d; g)$ exists for all
$d \in S$ and for all $g \in R^n$. Dem'yanov and Vasil'ev (1985) show that the relation

\[ f'(d; g) = \max_{\xi \in \partial f(d)} \xi^T g \tag{3.10} \]

holds.
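
Relation (3.10) can be illustrated on a simple nondifferentiable convex function. The sketch below uses a hypothetical piecewise-linear function $f(d) = \max_i a_i^T d$, which is not the MPDA objective; at a point where two pieces are active, the subdifferential is the convex hull of the active rows $a_i$, and the maximum in (3.10) is attained at one of those rows.

```python
import numpy as np

# Rows a_i of a hypothetical piecewise-linear convex function f(d) = max_i a_i^T d
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [-1.0, -1.0]])

def f(d):
    return np.max(A @ d)

d = np.array([1.0, 1.0])                  # a_1^T d = a_2^T d = 1: f has a kink here
active = A[np.isclose(A @ d, f(d))]       # extreme points of the subdifferential at d

g = np.array([-1.0, 2.0])                 # direction of interest
formula = np.max(active @ g)              # right-hand side of (3.10)

lam = 1e-6
numeric = (f(d + lam * g) - f(d)) / lam   # one-sided difference quotient

print(formula, numeric)
```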

The function $f$ is differentiable at $d \in S$ if and only if there exists a vector
$\nabla f(d)$ (necessarily unique), called the gradient, for which

\[ f(g) = f(d) + \nabla f^T(d)(g - d) + o(\|g - d\|) \]

or, equivalently,

\[ \lim_{g \to d} \frac{f(g) - f(d) - \nabla f^T(d)(g - d)}{\|g - d\|} = 0 \]

If $f$ is a convex function then $f$ is differentiable at $d \in S$ if and only if the partial
derivatives exist. In addition, the gradient is given by

\[
\nabla f(d) = \begin{bmatrix} \dfrac{\partial f(d)}{\partial d_1} & \dfrac{\partial f(d)}{\partial d_2} & \cdots & \dfrac{\partial f(d)}{\partial d_n} \end{bmatrix}^T
\]

and $f$ has only one subgradient, namely the gradient $\nabla f(d)$, such that

\[ \partial f(d) = \{\nabla f(d)\} \tag{3.11} \]

Also,

\[ f(g) \ge f(d) + \nabla f^T(d)(g - d), \quad \forall g \in S \]

That is, $\nabla f(d)$ is the normal of the tangent supporting hyperplane of $\mathrm{epi}\,f$ at $d$.

With the terminology of differential theory thus developed, several important
theorems are given that are used in the sequel. The first theorem describes the set of
points where $f$ is differentiable. This theorem is used as the basis of the primary
assumption made in the results section, namely, that the function being considered (i.e.,
$f(d) := \bar{\sigma}(DMD^{-1})$) is nondifferentiable only at points in its domain. The second
theorem gives a characterization of the subdifferential that is used to construct an
expression for $\partial f(d)$ when the function is nondifferentiable.

Theorem 3.2. Let $f$ be a convex function defined on a convex set $S \subset R^n$, and
let $\mathcal{D}$ be the set of points where $f$ is differentiable. Then $\mathcal{D}$ is a dense
subset of $S$, and its complement in $S$, $S \setminus \mathcal{D}$, is a set of measure zero.
Furthermore, the gradient mapping $\nabla f: d \to \nabla f(d)$ is continuous on $\mathcal{D}$.

Proof. Two different proofs are given in Dem'yanov and Vasil'ev (1985) and
Rockafellar (1972). Both proofs are based on measure theory, and show that there are a
countable number of sets where $f$ is not differentiable. Q.E.D.

Theorem 3.2 essentially states that $f$ is differentiable almost everywhere in $S$.

Theorem 3.3. Let $f$ be a convex function defined on a convex set $S \subset R^n$, and
let $\mathcal{D}$ be the set of points where $f$ is differentiable. Then

\[ \partial f(d) = \mathrm{conv}\{z \in R^n \mid z = \lim_{k \to \infty} \nabla f(d_k),\ d_k \to d,\ d_k \in \mathcal{D}\} \]

Proof. Again, two different proofs are given in Dem'yanov and Vasil'ev (1985)
and Rockafellar (1972). Both proofs use the continuity of the gradient on $\mathcal{D}$ given in
Theorem 3.2 to show that the limit sequences exist and that they converge to exposed
points of $\partial f(d)$. Therefore $\partial f(d)$ is the convex hull of all such limits. Q.E.D.

As presented, Theorem 3.3 seems of little practical value, because constructing
$\partial f(d)$ from it requires the construction of an infinite number of limit sequences. This
is not the case, as is shown in the results section. The next two theorems deal with solving the
optimization problem of infimizing $f(d)$. One gives conditions for an infimum; the
other gives an expression for the steepest descent direction.

Theorem 3.4. For the convex function $f(d)$ to attain its minimum value on $S$ at
the point $d_0$, it is necessary and sufficient that

\[ 0 \in \partial f(d_0) \]

Proof. A detailed proof is given in Dem'yanov and Vasil'ev (1985). Basically,
the condition is sufficient, because $\mathrm{epi}\,f$ is entirely above a horizontal supporting
hyperplane at $d_0$. The condition is necessary, because if $0 \notin \partial f(d_0)$ then it is
possible to find a direction that would decrease $f(d_0)$, such that $f(d_0)$ is not the minimum.
Q.E.D.

The steepest decreasing direction is given in the following theorem.

Theorem 3.5. If $0 \notin \partial f(d)$, then the subgradient given by

\[ \xi_{sd}(d) = \arg\min_{\xi \in \partial f(d)} \|\xi\| \tag{3.12} \]

points in the opposite direction of the steepest descent direction. That is,

\[ g(d) = -\frac{\xi_{sd}(d)}{\|\xi_{sd}(d)\|} \]

is the steepest descent direction of $f$ at $d$.

Proof. A detailed proof is given in Dem'yanov and Vasil'ev (1985). The proof is
based on finding the direction that gives the smallest directional derivative as given by
(3.10). Q.E.D.
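
Theorem 3.5 can be illustrated for a subdifferential generated by two gradients, where the minimum-norm element has a closed form. The sketch below is an assumption-laden illustration: the helper `min_norm_in_segment` is hypothetical, and the generators `g1`, `g2` correspond to the simple function $f(d) = \max(d_1, d_2)$ at a point with $d_1 = d_2$, not to the MPDA objective.

```python
import numpy as np

def min_norm_in_segment(g1, g2):
    """Minimum-norm point of conv{g1, g2} (closed form for two generators)."""
    diff = g2 - g1
    denom = diff @ diff
    t = 0.0 if denom == 0.0 else float(np.clip(-(g1 @ diff) / denom, 0.0, 1.0))
    return g1 + t * diff

# Subdifferential of f(d) = max(d_1, d_2) at a point with d_1 = d_2
g1 = np.array([1.0, 0.0])
g2 = np.array([0.0, 1.0])

xi_sd = min_norm_in_segment(g1, g2)          # (3.12)
g = -xi_sd / np.linalg.norm(xi_sd)           # steepest descent direction

# Directional derivative along g from (3.10): max over the generators
slope = max(g1 @ g, g2 @ g)
print(xi_sd, g, slope)
```

For this example the directional derivative along `g` equals $-\|\xi_{sd}\|$, which is smaller than the value obtained along either $-g_1$ or $-g_2$ alone.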

It  is  obvious  that  Theorem  3.4  and  Theorem  3.5  are  of  great  utility  for  any 
steepest  descent  nondifferentiable  optimization  algorithm. 
3.2.5. Expression for the gradient when the maximum singular value is distinct.

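The mathematical background will now focus on the problem at hand, namely
performing the infimization

$$\inf_{\mathbf{D} \in \mathcal{D}} \bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1}) \qquad (3.13)$$

The objective function $f(\mathbf{d}) = \bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ (where $\mathbf{D} = \operatorname{diag}(\mathbf{d})$ and the domain of the
objective function is the positive orthant such that $\mathbf{D} \in \mathcal{D}$) is convex, as was already
stated. Latchman (1986) has stated that when the maximum singular value is distinct, the
gradient exists and is given by a relatively simple expression. The following is the
derivation of this expression. After defining $\tilde{\mathbf{M}} := \mathbf{D}\mathbf{M}\mathbf{D}^{-1}$ to simplify the notation, the
singular value decomposition and equation (3.3) give

$$\bar\sigma^2(\mathbf{D}\mathbf{M}\mathbf{D}^{-1}) = \bar\sigma^2(\tilde{\mathbf{M}}) = \mathbf{y}^*(\tilde{\mathbf{M}})\,\tilde{\mathbf{M}}^*\tilde{\mathbf{M}}\,\mathbf{y}(\tilde{\mathbf{M}}) \qquad (3.14)$$

In the intermediate steps below, $\mathbf{x}$, $\mathbf{y}$ and $\bar\sigma$ are all evaluated at $\tilde{\mathbf{M}}$. If it exists, the partial
derivative of (3.14) with respect to the diagonal element $d_i$ of matrix $\mathbf{D}$ is given by

$$\frac{\partial \bar\sigma^2}{\partial d_i} = \frac{\partial\left(\mathbf{y}^*\tilde{\mathbf{M}}^*\tilde{\mathbf{M}}\mathbf{y}\right)}{\partial d_i}$$

which by the chain rule becomes

$$\frac{\partial \bar\sigma^2}{\partial d_i} = \frac{\partial \mathbf{y}^*}{\partial d_i}\tilde{\mathbf{M}}^*\tilde{\mathbf{M}}\mathbf{y} + \mathbf{y}^*\tilde{\mathbf{M}}^*\tilde{\mathbf{M}}\frac{\partial \mathbf{y}}{\partial d_i} + \mathbf{y}^*\frac{\partial(\tilde{\mathbf{M}}^*\tilde{\mathbf{M}})}{\partial d_i}\mathbf{y}$$

Using (3.3) gives

$$\frac{\partial \bar\sigma^2}{\partial d_i} = \bar\sigma^2\frac{\partial \mathbf{y}^*}{\partial d_i}\mathbf{y} + \bar\sigma^2\mathbf{y}^*\frac{\partial \mathbf{y}}{\partial d_i} + \mathbf{y}^*\frac{\partial(\tilde{\mathbf{M}}^*\tilde{\mathbf{M}})}{\partial d_i}\mathbf{y}$$

which, because $\mathbf{y}^*\mathbf{y} = 1$ is constant, simplifies to

$$\frac{\partial \bar\sigma^2}{\partial d_i} = \bar\sigma^2\frac{\partial(\mathbf{y}^*\mathbf{y})}{\partial d_i} + \mathbf{y}^*\frac{\partial(\tilde{\mathbf{M}}^*\tilde{\mathbf{M}})}{\partial d_i}\mathbf{y} = \mathbf{y}^*\frac{\partial(\tilde{\mathbf{M}}^*\tilde{\mathbf{M}})}{\partial d_i}\mathbf{y}$$

Expanding the partial derivative term now gives

$$\begin{aligned}
\frac{\partial \bar\sigma^2}{\partial d_i} &= \mathbf{y}^*\left[\frac{\partial(\mathbf{D}^{-1}\mathbf{M}^*\mathbf{D}^2\mathbf{M}\mathbf{D}^{-1})}{\partial d_i}\right]\mathbf{y} \\
&= \mathbf{y}^*\left[\frac{\partial \mathbf{D}^{-1}}{\partial d_i}\mathbf{M}^*\mathbf{D}^2\mathbf{M}\mathbf{D}^{-1} + \mathbf{D}^{-1}\mathbf{M}^*\frac{\partial \mathbf{D}^2}{\partial d_i}\mathbf{M}\mathbf{D}^{-1} + \mathbf{D}^{-1}\mathbf{M}^*\mathbf{D}^2\mathbf{M}\frac{\partial \mathbf{D}^{-1}}{\partial d_i}\right]\mathbf{y} \\
&= \mathbf{y}^*\left[-\frac{1}{d_i^2}\mathbf{E}_i\mathbf{M}^*\mathbf{D}^2\mathbf{M}\mathbf{D}^{-1} + 2d_i\mathbf{D}^{-1}\mathbf{M}^*\mathbf{E}_i\mathbf{M}\mathbf{D}^{-1} - \frac{1}{d_i^2}\mathbf{D}^{-1}\mathbf{M}^*\mathbf{D}^2\mathbf{M}\mathbf{E}_i\right]\mathbf{y}
\end{aligned}$$

where $\mathbf{E}_i$ is a diagonal $n \times n$ matrix with a 1 in the $(i,i)$ position, and zeros everywhere
else. Since $\mathbf{E}_i = d_i\mathbf{E}_i\mathbf{D}^{-1} = \mathbf{D}\mathbf{E}_i\mathbf{D}/d_i^2 = d_i\mathbf{D}^{-1}\mathbf{E}_i$, the above equation becomes

$$\begin{aligned}
\frac{\partial \bar\sigma^2}{\partial d_i} &= \frac{1}{d_i}\mathbf{y}^*\left[-\mathbf{E}_i\mathbf{D}^{-1}\mathbf{M}^*\mathbf{D}^2\mathbf{M}\mathbf{D}^{-1} + 2\mathbf{D}^{-1}\mathbf{M}^*\mathbf{D}\mathbf{E}_i\mathbf{D}\mathbf{M}\mathbf{D}^{-1} - \mathbf{D}^{-1}\mathbf{M}^*\mathbf{D}^2\mathbf{M}\mathbf{D}^{-1}\mathbf{E}_i\right]\mathbf{y} \\
&= \frac{2}{d_i}\mathbf{y}^*\left[\tilde{\mathbf{M}}^*\mathbf{E}_i\tilde{\mathbf{M}} - \bar\sigma^2\mathbf{E}_i\right]\mathbf{y}
\end{aligned}$$

Using equation (3.4) this becomes

$$\frac{\partial \bar\sigma^2}{\partial d_i} = \frac{2\bar\sigma^2}{d_i}\left[\mathbf{x}^*\mathbf{E}_i\mathbf{x} - \mathbf{y}^*\mathbf{E}_i\mathbf{y}\right] = \frac{2\bar\sigma^2}{d_i}\left[|x_i|^2 - |y_i|^2\right]$$

Now,

$$\frac{\partial \bar\sigma^2(\tilde{\mathbf{M}})}{\partial d_i} = 2\bar\sigma(\tilde{\mathbf{M}})\frac{\partial \bar\sigma(\tilde{\mathbf{M}})}{\partial d_i}$$

such that an expression for the partial derivative of $\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ with respect to the
diagonal element $d_i$ of matrix $\mathbf{D}$ is given by

$$\frac{\partial \bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})}{\partial d_i} = \frac{\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})}{d_i}\left[|x_i(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})|^2 - |y_i(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})|^2\right] \qquad (3.15)$$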

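When the maximum singular value $\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ is distinct, the major output
principal direction $\mathbf{x}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ and the major input principal direction $\mathbf{y}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ are
given, up to a scaling factor $e^{j\theta}$, by the left singular vector $\mathbf{x}_1(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ and the right
singular vector $\mathbf{y}_1(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$, respectively. Therefore, $|x_i(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})|$ and $|y_i(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})|$ are
unique and the partial derivatives (3.15) exist for $i = 1, 2, \ldots, n$ such that the gradient is
given by

$$\nabla f(\mathbf{d}) = \nabla\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1}) = \bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})\,\mathbf{D}^{-1}\left[\,|\mathbf{x}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})|^2 - |\mathbf{y}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})|^2\right] \qquad (3.16)$$

where the absolute value $|\cdot|$ is considered an element-by-element operator when applied
to a vector. As the preceding development has verified, when the maximum singular
value is distinct the gradient of the objective function $f(\mathbf{d}) = \bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ exists and is
given by (3.16). In addition, the subdifferential is given by (3.11). When the maximum
singular value is repeated the gradient no longer exists, but it is possible to determine the
subdifferential and therefore a steepest descent direction. This is the main theoretical
result of this paper and is given in the next section.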
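
As a numerical illustration of (3.16), the sketch below evaluates the gradient from a singular value decomposition. The function name `sigma_max_gradient` and the 2x2 test matrix are assumptions chosen for illustration; the test matrix gives $\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1}) = d_1/d_2$, so the result can be checked by hand. The sketch assumes the maximum singular value is distinct.

```python
import numpy as np

def sigma_max_gradient(M, d):
    """Return (sigma_max, gradient) of f(d) = sigma_max(D M D^{-1}) using (3.16).

    Valid only when the maximum singular value is distinct.
    """
    d = np.asarray(d, dtype=float)
    Mt = (d[:, None] * M) / d[None, :]   # D M D^{-1} without forming D explicitly
    U, s, Vh = np.linalg.svd(Mt)
    x = U[:, 0]                          # major output principal direction
    y = Vh[0].conj()                     # major input principal direction
    grad = s[0] / d * (np.abs(x) ** 2 - np.abs(y) ** 2)
    return s[0], grad

# Hand-checkable case: D M D^{-1} = [[0, d1/d2], [0, 0]], so f(d) = d1/d2,
# df/dd1 = 1/d2 and df/dd2 = -d1/d2**2.
M = np.array([[0.0, 1.0], [0.0, 0.0]])
sigma, grad = sigma_max_gradient(M, [2.0, 4.0])
```

For $\mathbf{d} = [2\ \ 4]^T$ the analytic partial derivatives are $1/d_2 = 0.25$ and $-d_1/d_2^2 = -0.125$, which is what the formula returns.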
3.3. Main Result - Characterization of the Subdifferential when
the Maximum Singular Value is Repeated

3.3.1.  General  Expression  for  the  Subdifferential 

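When the maximum singular value is repeated, the major output principal
direction $\mathbf{x}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ and the major input principal direction $\mathbf{y}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ are determined
by (3.5) and (3.6). As such, the expression for the gradient given by (3.16) may not be
uniquely determined, which implies the objective function may not be differentiable.
When the function is non-differentiable the subdifferential must be determined, as
opposed to the gradient. To characterize the subdifferential, define the function

$$\nabla f_u(\mathbf{d};\mathbf{u}) = \bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})\,\mathbf{D}^{-1}\left[\,\left|\sum_{i=1}^{r}\mathbf{x}_i(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})\,u_i\right|^2 - \left|\sum_{i=1}^{r}\mathbf{y}_i(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})\,u_i\right|^2\right] \qquad (3.17)$$

where $\mathbf{u} \in \mathbb{C}^r$ satisfies $\mathbf{u}^*\mathbf{u} = 1$, $\{\mathbf{x}_i(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})\}$ for $i = 1, 2, \ldots, r$ is an orthonormal set
of left singular vectors, and $\{\mathbf{y}_i(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})\}$ for $i = 1, 2, \ldots, r$ is an orthonormal set of right
singular vectors corresponding to the maximum singular value $\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ of
multiplicity $r$. Definition (3.17) represents the evaluation of the gradient function (3.16)
for possible values of $\mathbf{x}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ and $\mathbf{y}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$. For different $\mathbf{u}$, the function
$\nabla f_u(\mathbf{d};\mathbf{u})$ may give different values, such that the gradient is not unique and is therefore
undefined. The subdifferential is now characterized in the following theorem.

Theorem 3.6. The subdifferential of the function $f(\mathbf{d}) = \bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ is given by

$$\partial f(\mathbf{d}) = \operatorname{conv}\{\nabla f_u(\mathbf{d};\mathbf{u}) \mid \mathbf{u}^*\mathbf{u} = 1\} \qquad (3.18)$$

where $\nabla f_u(\mathbf{d};\mathbf{u})$ is defined by (3.17).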

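Proof. From Theorem 3.3, the subdifferential is given by

$$\partial f(\mathbf{d}) = \operatorname{conv}\{\mathbf{z} \in \mathbb{R}^n \mid \mathbf{z} = \lim_{k \to \infty} \nabla f(\mathbf{d}_k),\ \mathbf{d}_k \to \mathbf{d},\ \mathbf{d}_k \in D\}$$

where $D$ is the set of points where $f(\mathbf{d}) = \bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ is differentiable (i.e., the
maximum singular value is distinct). The gradients $\nabla f(\mathbf{d}_k)$ are given by (3.16), and are
determined from the sequence of major output principal directions
$\mathbf{x}_k(\mathbf{D}_k\mathbf{M}\mathbf{D}_k^{-1})$ and major input principal directions $\mathbf{y}_k(\mathbf{D}_k\mathbf{M}\mathbf{D}_k^{-1})$, which are uniquely
determined up to a multiple of $e^{j\theta}$. From the perturbation theory of matrices (Lancaster
and Tismenetsky, 1985), analytic perturbations of normal matrices (i.e.,
$(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})^*(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$) have continuous eigenvalues and eigenvectors in a neighborhood
of the perturbation. Now, the right singular vectors $\{\mathbf{y}_i(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})\}$ are an orthonormal
eigenbasis of $(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})^*(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$. Therefore, the right singular vectors are continuous
in $\mathbf{D}$. This implies that at points where the maximum singular value is non-differentiable, each
major input principal direction $\mathbf{y}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ (all of which are given by (3.6)) is the limit of
a sequence of major input principal directions $\mathbf{y}_k(\mathbf{D}_k\mathbf{M}\mathbf{D}_k^{-1})$ that correspond to a
maximum singular value that is differentiable. In addition, the converse is true: a
sequence of major input principal directions $\mathbf{y}_k(\mathbf{D}_k\mathbf{M}\mathbf{D}_k^{-1})$ that corresponds to a
sequence of differentiable maximum singular values $\bar\sigma_k(\mathbf{D}_k\mathbf{M}\mathbf{D}_k^{-1})$ converges to a
major input principal direction $\mathbf{y}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ given by (3.6) if the sequence $\bar\sigma_k(\mathbf{D}_k\mathbf{M}\mathbf{D}_k^{-1})$
converges to the maximum singular value $\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$. Similar arguments can be made
for the left singular vectors/major output principal directions. Therefore, for all $\mathbf{d}_k \in D$,
as $\mathbf{d}_k \to \mathbf{d}$ then $\mathbf{x}_k(\mathbf{D}_k\mathbf{M}\mathbf{D}_k^{-1}) \to \mathbf{x}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ and $\mathbf{y}_k(\mathbf{D}_k\mathbf{M}\mathbf{D}_k^{-1}) \to \mathbf{y}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$, which
are given by (3.5) and (3.6) with $\mathbf{u}^*\mathbf{u} = 1$. In addition, for every point where the
maximum singular value is non-differentiable, $\mathbf{x}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ and $\mathbf{y}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ are given by
(3.5) and (3.6), and there exists a sufficiently small perturbation of $\mathbf{d}$ such that there
exist sequences $\mathbf{x}_k(\mathbf{D}_k\mathbf{M}\mathbf{D}_k^{-1}) \to \mathbf{x}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ and $\mathbf{y}_k(\mathbf{D}_k\mathbf{M}\mathbf{D}_k^{-1}) \to \mathbf{y}(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ which
are uniquely determined up to a multiple of $e^{j\theta}$ such that the gradients $\nabla f(\mathbf{d}_k)$ exist. All
that is left is to define $\nabla f_u(\mathbf{d};\mathbf{u})$ by (3.17), which represents the limit of $\nabla f(\mathbf{d}_k)$ as
$\mathbf{d}_k \to \mathbf{d}$ for some $\mathbf{d}_k \in D$, where the set of all $\mathbf{u}$ such that $\mathbf{u}^*\mathbf{u} = 1$ represents all possible limit
sequences $\mathbf{d}_k \to \mathbf{d}$ for $\mathbf{d}_k \in D$. Q.E.D.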

Theorem 3.6 is the natural extension of the gradient result given in Section 3.2.5.
For the case when the maximum singular value is distinct, $\nabla f_u(\mathbf{d};\mathbf{u}) = \nabla f(\mathbf{d})$ for all
$\mathbf{u}^*\mathbf{u} = 1$ (i.e., $\mathbf{u} = u_1 = e^{j\theta}$) such that $\partial f(\mathbf{d}) = \operatorname{conv}\{\nabla f(\mathbf{d})\} = \{\nabla f(\mathbf{d})\}$ as expected. On
the other hand, when the maximum singular value is repeated $\nabla f(\mathbf{d})$ does not exist.
Instead one has $\nabla f_u(\mathbf{d};\mathbf{u})$, which is an extension of equation (3.16) for $\nabla f(\mathbf{d})$, in that
$\nabla f_u(\mathbf{d};\mathbf{u})$ represents the vector obtained when equation (3.16) is evaluated at one of the
possible major output and input principal directions given by (3.5) and (3.6). Obviously,
$\nabla f_u(\mathbf{d};\mathbf{u})$ is a subgradient, since it is an element of $\partial f(\mathbf{d})$. In fact, $\nabla f_u(\mathbf{d};\mathbf{u})$ represents a
subgradient that is arbitrarily close to some $\nabla f(\mathbf{d}_k)$ where $\mathbf{d}_k \to \mathbf{d}$. That is, the vectors $\nabla f_u(\mathbf{d};\mathbf{u})$
for $\mathbf{u}^*\mathbf{u} = 1$ represent the boundary of $\partial f(\mathbf{d})$. Note that a repeated maximum singular
value does not necessarily guarantee non-differentiability. Consider the matrix $\mathbf{M} = \mathbf{I}$,
where $f(\mathbf{d}) = \bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1}) = \bar\sigma(\mathbf{D}\mathbf{I}\mathbf{D}^{-1}) = \bar\sigma(\mathbf{I}) = 1$ with multiplicity $n$ independent of $\mathbf{d}$.
The function $\nabla f_u(\mathbf{d};\mathbf{u}) = 0$ for all $\mathbf{d}$ and $\mathbf{u}$, such that $\partial f(\mathbf{d}) = \{0\} = \{\nabla f(\mathbf{d})\}$, where the
gradient exists and is identically zero. This is an extreme case where the set given by
(3.18) degenerates to a point. This and other degenerate cases should be taken into
consideration when using Theorem 3.6.

3.3.2.  Characterization  of  the  Subdifferential  as  an  Ellipsoid. 

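When the maximum singular value is distinct, the subdifferential is given by the
single point $\partial f(\mathbf{d}) = \{\nabla f_u(\mathbf{d};\mathbf{u})\} = \{\nabla f(\mathbf{d})\}$, where $\nabla f(\mathbf{d})$ is given in Section 3.2.5. The next step
is to explore the case when the maximum singular value is repeated once. In this section,
it is shown that $\partial f(\mathbf{d})$ is an ellipsoid when the maximum singular value is repeated once
by examining the properties of the function $\nabla f_u(\mathbf{d};\mathbf{u})$. To simplify the notation, define
the vector valued function $\mathbf{g}: \mathbb{C}^2 \to \mathbb{R}^n$ as $\mathbf{g}(\mathbf{u}) := \nabla f_u(\mathbf{d};\mathbf{u})$, where $\mathbf{d}$ is a fixed point
such that the maximum singular value $f(\mathbf{d}) = \bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ has multiplicity 2. From
Theorem 3.6, this gives

$$\partial f(\mathbf{d}) = \operatorname{conv}\{\mathbf{g}(\mathbf{u}) \mid \mathbf{u}^*\mathbf{u} = 1,\ \mathbf{u} \in \mathbb{C}^2\} \qquad (3.19)$$

To analyze (3.19), $\mathbf{g}(\mathbf{u})$ is expressed in terms of $\mathbf{u} = [\,|u_1|e^{j\angle u_1} \quad |u_2|e^{j\angle u_2}\,]^T$ as

$$\mathbf{g}(\mathbf{u}) = \mathbf{H}\begin{bmatrix} |u_1|^2 \\ \cos(\angle u_1 - \angle u_2)\,|u_1||u_2| \\ \sin(\angle u_1 - \angle u_2)\,|u_1||u_2| \\ |u_2|^2 \end{bmatrix} \qquad (3.20)$$

where the elements of the constant matrix $\mathbf{H}$ are given by

$$h_{i,1} = \frac{\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})}{d_i}\left(|x_{i1}|^2 - |y_{i1}|^2\right) \qquad (3.21)$$

$$h_{i,2} = 2\,\frac{\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})}{d_i}\left(|x_{i1}||x_{i2}|\cos(\angle x_{i1} - \angle x_{i2}) - |y_{i1}||y_{i2}|\cos(\angle y_{i1} - \angle y_{i2})\right) \qquad (3.22)$$

$$h_{i,3} = -2\,\frac{\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})}{d_i}\left(|x_{i1}||x_{i2}|\sin(\angle x_{i1} - \angle x_{i2}) - |y_{i1}||y_{i2}|\sin(\angle y_{i1} - \angle y_{i2})\right) \qquad (3.23)$$

and

$$h_{i,4} = \frac{\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})}{d_i}\left(|x_{i2}|^2 - |y_{i2}|^2\right) \qquad (3.24)$$

for $i = 1, 2, \ldots, n$ with

$$\mathbf{x}_1 = \begin{bmatrix} x_{11} \\ x_{21} \\ \vdots \\ x_{n1} \end{bmatrix}, \quad \mathbf{x}_2 = \begin{bmatrix} x_{12} \\ x_{22} \\ \vdots \\ x_{n2} \end{bmatrix} \qquad (3.25)$$

an orthonormal set of left singular vectors and

$$\mathbf{y}_1 = \begin{bmatrix} y_{11} \\ y_{21} \\ \vdots \\ y_{n1} \end{bmatrix}, \quad \mathbf{y}_2 = \begin{bmatrix} y_{12} \\ y_{22} \\ \vdots \\ y_{n2} \end{bmatrix} \qquad (3.26)$$

an orthonormal set of right singular vectors corresponding to the maximum singular value
$\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ of multiplicity 2. Equation (3.20) is obtained from equation (3.17) when the
maximum singular value has multiplicity 2, i.e.

$$\mathbf{g}(\mathbf{u}) = \nabla f_u(\mathbf{d};\mathbf{u}) = \bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})\,\mathbf{D}^{-1}\left[\,|\mathbf{x}_1u_1 + \mathbf{x}_2u_2|^2 - |\mathbf{y}_1u_1 + \mathbf{y}_2u_2|^2\right]$$

Using (3.25) for $\{\mathbf{x}_1, \mathbf{x}_2\}$ and (3.26) for $\{\mathbf{y}_1, \mathbf{y}_2\}$, and considering one element of $\mathbf{g}(\mathbf{u})$,
the law of cosines gives

$$\begin{aligned}
g_i(\mathbf{u}) ={}& \frac{\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})}{d_i}\left\{|x_{i1}u_1|^2 + 2|x_{i1}u_1||x_{i2}u_2|\cos(\angle(x_{i1}u_1) - \angle(x_{i2}u_2)) + |x_{i2}u_2|^2\right\} \\
&- \frac{\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})}{d_i}\left\{|y_{i1}u_1|^2 + 2|y_{i1}u_1||y_{i2}u_2|\cos(\angle(y_{i1}u_1) - \angle(y_{i2}u_2)) + |y_{i2}u_2|^2\right\}
\end{aligned}$$

Using trigonometric and complex number identities gives

$$\begin{aligned}
g_i(\mathbf{u}) ={}& \frac{\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})}{d_i}\left(|x_{i1}|^2 - |y_{i1}|^2\right)|u_1|^2 \\
&+ 2\,\frac{\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})}{d_i}\left(|x_{i1}||x_{i2}|\cos(\angle x_{i1} - \angle x_{i2}) - |y_{i1}||y_{i2}|\cos(\angle y_{i1} - \angle y_{i2})\right)\cos(\angle u_1 - \angle u_2)\,|u_1||u_2| \\
&- 2\,\frac{\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})}{d_i}\left(|x_{i1}||x_{i2}|\sin(\angle x_{i1} - \angle x_{i2}) - |y_{i1}||y_{i2}|\sin(\angle y_{i1} - \angle y_{i2})\right)\sin(\angle u_1 - \angle u_2)\,|u_1||u_2| \\
&+ \frac{\bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})}{d_i}\left(|x_{i2}|^2 - |y_{i2}|^2\right)|u_2|^2
\end{aligned}$$

which is of the form (3.20), where the elements of $\mathbf{H}$ are defined by (3.21)-(3.24),
respectively.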
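
The equivalence between (3.17) and the form (3.20) can be checked numerically. The sketch below builds $\mathbf{H}$ from (3.21)-(3.24) and compares $\mathbf{H}$ times the vector in (3.20) with a direct evaluation of (3.17). The helper names `build_H` and `w_of_u`, the random test matrix, and the seed are assumptions made for illustration; the matrix is constructed as $\mathbf{M} = \mathbf{A}\mathbf{B}^*$ with orthonormal columns so that the maximum singular value has multiplicity 2.

```python
import numpy as np

def build_H(M, d):
    """Matrix H of (3.21)-(3.24); assumes sigma_max(D M D^{-1}) has multiplicity 2."""
    d = np.asarray(d, dtype=float)
    U, s, Vh = np.linalg.svd((d[:, None] * M) / d[None, :])
    x1, x2 = U[:, 0], U[:, 1]                  # left singular vectors
    y1, y2 = Vh[0].conj(), Vh[1].conj()        # right singular vectors
    cx, cy = x1 * x2.conj(), y1 * y2.conj()    # |a||b| exp(j(angle a - angle b))
    k = s[0] / d
    H = np.column_stack([
        k * (np.abs(x1) ** 2 - np.abs(y1) ** 2),   # (3.21)
        2 * k * (cx.real - cy.real),               # (3.22)
        -2 * k * (cx.imag - cy.imag),              # (3.23)
        k * (np.abs(x2) ** 2 - np.abs(y2) ** 2),   # (3.24)
    ])
    return H, s[0], (x1, x2, y1, y2)

def w_of_u(u):
    """The 4-vector that multiplies H in (3.20)."""
    theta = np.angle(u[0]) - np.angle(u[1])
    r1, r2 = abs(u[0]), abs(u[1])
    return np.array([r1 ** 2, np.cos(theta) * r1 * r2, np.sin(theta) * r1 * r2, r2 ** 2])

rng = np.random.default_rng(0)
n = 5
A, _ = np.linalg.qr(rng.standard_normal((n, 2)) + 1j * rng.standard_normal((n, 2)))
B, _ = np.linalg.qr(rng.standard_normal((n, 2)) + 1j * rng.standard_normal((n, 2)))
M = A @ B.conj().T                             # singular values 1, 1, 0, 0, 0
d = np.ones(n)

H, sigma, (x1, x2, y1, y2) = build_H(M, d)
u = rng.standard_normal(2) + 1j * rng.standard_normal(2)
u /= np.linalg.norm(u)
g_direct = sigma / d * (np.abs(x1 * u[0] + x2 * u[1]) ** 2
                        - np.abs(y1 * u[0] + y2 * u[1]) ** 2)   # (3.17) with r = 2
g_from_H = H @ w_of_u(u)                                        # (3.20)
```

The two vectors agree to rounding error for any unit-norm `u`, since (3.20) is an algebraic rewriting of (3.17).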

There are now three cases to consider. The first case is when $n = 2$. This is a
trivial case, in that the optimal similarity scaling is given by the Perron scaling.
Therefore, there is no need to further investigate the properties of the subdifferential
when $n = 2$. The other two cases are when $n = 3$ and when $n > 3$. As will be discussed
shortly, the case when $n = 3$ is a degenerate case of the more general case $n > 3$.
Therefore the case when $n > 3$ will be discussed next, followed by the case when $n = 3$.
The first result is that the subdifferential given by (3.19) is contained within an affine set
of dimension 3. The result is stated in the following theorem.

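Theorem 3.7. For $n > 3$ and $\mathbf{d}$ such that the maximum singular value
$f(\mathbf{d}) = \bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ has multiplicity 2, the subdifferential, $\partial f(\mathbf{d})$, is contained in
the affine set $S = \{\mathbf{z} \in \mathbb{R}^n \mid \mathbf{P}\mathbf{z} = \mathbf{q}\}$ where the elements of the matrix

$$\mathbf{P} = \begin{bmatrix}
p_{1,1} & p_{1,2} & p_{1,3} & 1 & 0 & \cdots & 0 \\
p_{2,1} & p_{2,2} & p_{2,3} & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
p_{n-3,1} & p_{n-3,2} & p_{n-3,3} & 0 & 0 & \cdots & 1
\end{bmatrix} \in \mathbb{R}^{(n-3) \times n}$$

and the vector $\mathbf{q} = [\,q_1 \quad q_2 \quad \cdots \quad q_{n-3}\,]^T$ satisfy

$$\begin{bmatrix} p_{i,1} \\ p_{i,2} \\ p_{i,3} \\ q_i \end{bmatrix} =
\begin{bmatrix}
h_{1,1} & h_{2,1} & h_{3,1} & -1 \\
h_{1,2} & h_{2,2} & h_{3,2} & 0 \\
h_{1,3} & h_{2,3} & h_{3,3} & 0 \\
h_{1,4} & h_{2,4} & h_{3,4} & -1
\end{bmatrix}^{-1}
\begin{bmatrix} -h_{i+3,1} \\ -h_{i+3,2} \\ -h_{i+3,3} \\ -h_{i+3,4} \end{bmatrix} \qquad (3.27)$$

for $i = 1, 2, \ldots, n-3$, and where $h_{i,j}$ are given by (3.21)-(3.24). Additionally, the
dimension of $S$ is 3.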

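Proof. It is sufficient to show that every $\mathbf{g}(\mathbf{u})$ given by (3.20) with $\mathbf{u}^*\mathbf{u} = 1$ is an
element of the affine set $S = \{\mathbf{z} \in \mathbb{R}^n \mid \mathbf{P}\mathbf{z} = \mathbf{q}\}$, because if a set (i.e., $\{\mathbf{g}(\mathbf{u}) \mid \mathbf{u}^*\mathbf{u} = 1\}$) is
contained within an affine set (i.e., $S$) then the convex hull of the set (i.e.,
$\partial f(\mathbf{d}) = \operatorname{conv}\{\mathbf{g}(\mathbf{u}) \mid \mathbf{u}^*\mathbf{u} = 1\}$) is also contained within the affine set. Therefore, it must be
shown that for all $\mathbf{u}^*\mathbf{u} = 1$, $\mathbf{g}(\mathbf{u})$ satisfies each of the $n - 3$ linear equations that define
the affine set. The first linear equation is

$$[\,p_{1,1} \quad p_{1,2} \quad p_{1,3} \quad 1 \quad 0 \quad \cdots \quad 0\,]\,\mathbf{g}(\mathbf{u}) = q_1$$

which must hold for all $\mathbf{u}^*\mathbf{u} = 1$. This becomes

$$p_{1,1}g_1(\mathbf{u}) + p_{1,2}g_2(\mathbf{u}) + p_{1,3}g_3(\mathbf{u}) + g_4(\mathbf{u}) = q_1$$

which from equation (3.20) is equivalent to

$$\begin{aligned}
&p_{1,1}\left(h_{1,1}|u_1|^2 + h_{1,2}\cos(\angle u_1 - \angle u_2)|u_1||u_2| + h_{1,3}\sin(\angle u_1 - \angle u_2)|u_1||u_2| + h_{1,4}|u_2|^2\right) + {} \\
&p_{1,2}\left(h_{2,1}|u_1|^2 + h_{2,2}\cos(\angle u_1 - \angle u_2)|u_1||u_2| + h_{2,3}\sin(\angle u_1 - \angle u_2)|u_1||u_2| + h_{2,4}|u_2|^2\right) + {} \\
&p_{1,3}\left(h_{3,1}|u_1|^2 + h_{3,2}\cos(\angle u_1 - \angle u_2)|u_1||u_2| + h_{3,3}\sin(\angle u_1 - \angle u_2)|u_1||u_2| + h_{3,4}|u_2|^2\right) + {} \\
&\left(h_{4,1}|u_1|^2 + h_{4,2}\cos(\angle u_1 - \angle u_2)|u_1||u_2| + h_{4,3}\sin(\angle u_1 - \angle u_2)|u_1||u_2| + h_{4,4}|u_2|^2\right) = q_1
\end{aligned}$$

or

$$\begin{aligned}
&(p_{1,1}h_{1,1} + p_{1,2}h_{2,1} + p_{1,3}h_{3,1} + h_{4,1})\,|u_1|^2 + {} \\
&(p_{1,1}h_{1,2} + p_{1,2}h_{2,2} + p_{1,3}h_{3,2} + h_{4,2})\cos(\angle u_1 - \angle u_2)\,|u_1||u_2| + {} \\
&(p_{1,1}h_{1,3} + p_{1,2}h_{2,3} + p_{1,3}h_{3,3} + h_{4,3})\sin(\angle u_1 - \angle u_2)\,|u_1||u_2| + {} \\
&(p_{1,1}h_{1,4} + p_{1,2}h_{2,4} + p_{1,3}h_{3,4} + h_{4,4})\,|u_2|^2 = q_1
\end{aligned} \qquad (3.28)$$

Now, equation (3.27) gives

$$\begin{bmatrix}
h_{1,1} & h_{2,1} & h_{3,1} & -1 \\
h_{1,2} & h_{2,2} & h_{3,2} & 0 \\
h_{1,3} & h_{2,3} & h_{3,3} & 0 \\
h_{1,4} & h_{2,4} & h_{3,4} & -1
\end{bmatrix}
\begin{bmatrix} p_{i,1} \\ p_{i,2} \\ p_{i,3} \\ q_i \end{bmatrix} =
\begin{bmatrix} -h_{i+3,1} \\ -h_{i+3,2} \\ -h_{i+3,3} \\ -h_{i+3,4} \end{bmatrix}$$

such that (3.28) becomes

$$\begin{aligned}
&(q_1 - h_{4,1} + h_{4,1})\,|u_1|^2 + {} \\
&(-h_{4,2} + h_{4,2})\cos(\angle u_1 - \angle u_2)\,|u_1||u_2| + {} \\
&(-h_{4,3} + h_{4,3})\sin(\angle u_1 - \angle u_2)\,|u_1||u_2| + {} \\
&(q_1 - h_{4,4} + h_{4,4})\,|u_2|^2 = q_1
\end{aligned}$$

or

$$|u_1|^2 + |u_2|^2 = 1$$

which holds for all $\mathbf{u}^*\mathbf{u} = 1$. Hence, for all $\mathbf{u}^*\mathbf{u} = 1$, $\mathbf{g}(\mathbf{u})$ satisfies the first linear
equation that defines $S$. In fact, the preceding arguments hold for all $n - 3$ linear
equations that define $S$. Therefore, $\mathbf{g}(\mathbf{u})$ is contained within the affine set $S$ for all
$\mathbf{u}^*\mathbf{u} = 1$, implying that the subdifferential is contained within $S$. Finally, the $n - 3$
linearly independent rows of $\mathbf{P}$ are a basis for the orthogonal complement of the
subspace parallel to $S$ such that the dimension of the affine set $S$ is 3. Q.E.D.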
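
The construction in Theorem 3.7 is purely algebraic in the entries of $\mathbf{H}$, so it can be illustrated with any $n \times 4$ matrix. The sketch below solves (3.27) for $\mathbf{P}$ and $\mathbf{q}$ and checks $\mathbf{P}\mathbf{g}(\mathbf{u}) = \mathbf{q}$ for a random unit vector $\mathbf{u}$. The random matrix `H` is a synthetic stand-in (an assumption, not data from this chapter), and the function name is illustrative.

```python
import numpy as np

def affine_set_from_H(H):
    """Solve (3.27) for every row i; returns P of shape (n-3, n) and q of length n-3."""
    n = H.shape[0]
    # 4x4 coefficient matrix of (3.27): columns are rows 1..3 of H and [-1, 0, 0, -1].
    T = np.column_stack([H[:3].T, [-1.0, 0.0, 0.0, -1.0]])
    X = np.linalg.solve(T, -H[3:].T)      # column i holds [p_i1, p_i2, p_i3, q_i]
    P = np.hstack([X[:3].T, np.eye(n - 3)])
    q = X[3]
    return P, q

rng = np.random.default_rng(1)
H = rng.standard_normal((6, 4))           # synthetic stand-in for (3.21)-(3.24), n = 6
P, q = affine_set_from_H(H)

u = rng.standard_normal(2) + 1j * rng.standard_normal(2)
u /= np.linalg.norm(u)
theta = np.angle(u[0]) - np.angle(u[1])
r1, r2 = abs(u[0]), abs(u[1])
g = H @ np.array([r1 ** 2, np.cos(theta) * r1 * r2, np.sin(theta) * r1 * r2, r2 ** 2])
residual = P @ g - q                      # zero for every unit-norm u
```

The residual vanishes because the cosine and sine coefficients cancel and the remaining terms reduce to $q_i(|u_1|^2 + |u_2|^2) = q_i$, exactly as in the proof.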

Theorem 3.7 implies that the last $n - 3$ terms of $\mathbf{g}(\mathbf{u})$ can be expressed as
affine functions of the first 3 terms of $\mathbf{g}(\mathbf{u})$, such that the subdifferential $\partial f(\mathbf{d})$ is a
3-dimensional convex set in an $n$-dimensional space. This means that the first 3 terms of
$\mathbf{g}(\mathbf{u})$ (i.e., $\{g_1(\mathbf{u}), g_2(\mathbf{u}), g_3(\mathbf{u})\}$) describe $\partial f(\mathbf{d})$. Therefore, to complete the
characterization of $\partial f(\mathbf{d})$ it is only necessary to investigate $\operatorname{conv}\{g_1(\mathbf{u}), g_2(\mathbf{u}), g_3(\mathbf{u})\}$
for $\mathbf{u}^*\mathbf{u} = 1$, and then translate this 3-dimensional set to $\mathbb{R}^n$ using the affine functions
given in Theorem 3.7.

The convex hull, $\operatorname{conv}\{g_1(\mathbf{u}), g_2(\mathbf{u}), g_3(\mathbf{u}) \mid \mathbf{u}^*\mathbf{u} = 1\}$, is now shown to be a
3-dimensional ellipsoid, and thus $\partial f(\mathbf{d})$ is a 3-dimensional ellipsoid. Consider equation
(3.20). Even though $\mathbf{u} = [\,|u_1|e^{j\angle u_1} \quad |u_2|e^{j\angle u_2}\,]^T$ has 4 parameters (i.e., $|u_1|$, $|u_2|$, $\angle u_1$, and
$\angle u_2$), the function $\mathbf{g}(\mathbf{u})$ with $\mathbf{u}^*\mathbf{u} = 1$ is a function of only 2 parameters. One of the
parameters is $r_u = |u_1|$ and the other is $\theta_u = \angle u_1 - \angle u_2$. The reason $|u_2|$ is not a third
parameter is that $\mathbf{u}^*\mathbf{u} = 1$ necessarily requires $|u_2| = \sqrt{1 - |u_1|^2}$. Now consider a fixed
value of $r_u$. The terms $\{g_1(\mathbf{u}), g_2(\mathbf{u}), g_3(\mathbf{u})\}$ are of the form

$$\begin{aligned}
g_1(\mathbf{u}) &= e_{1,1} + e_{1,2}\cos(\theta_u) + e_{1,3}\sin(\theta_u) \\
g_2(\mathbf{u}) &= e_{2,1} + e_{2,2}\cos(\theta_u) + e_{2,3}\sin(\theta_u) \\
g_3(\mathbf{u}) &= e_{3,1} + e_{3,2}\cos(\theta_u) + e_{3,3}\sin(\theta_u)
\end{aligned}$$

which is a parametric representation of a 2-dimensional ellipse in a
3-dimensional space centered at $[\,e_{1,1} \quad e_{2,1} \quad e_{3,1}\,]^T$. To satisfy $\mathbf{u}^*\mathbf{u} = 1$, $|u_1|$ must be an
element of $[0,1]$. Therefore, varying $r_u$ over its admissible range of 0 to 1 generates a
set of ellipses which form the surface of an ellipsoid. This ellipsoid is given by

$$E = \{\mathbf{z} \in \mathbb{R}^3 \mid (\mathbf{z} - \mathbf{c})^T\mathbf{B}(\mathbf{z} - \mathbf{c}) = 1\} \qquad (3.29)$$

The center $\mathbf{c}$ of $E$ is given by

$$\begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} = 0.5\begin{bmatrix} h_{1,1} + h_{1,4} \\ h_{2,1} + h_{2,4} \\ h_{3,1} + h_{3,4} \end{bmatrix} \qquad (3.30)$$

and the matrix $\mathbf{B}$, which characterizes the length of the axes of $E$ and its orientation, has
the form

$$\mathbf{B} = \begin{bmatrix} b_{1,1} & b_{1,2} & b_{1,3} \\ b_{1,2} & b_{2,2} & b_{2,3} \\ b_{1,3} & b_{2,3} & b_{3,3} \end{bmatrix} \qquad (3.31)$$

where the 6 parameters $\{b_{1,1}, b_{2,2}, b_{3,3}, b_{1,2}, b_{1,3}, b_{2,3}\}$ can also be expressed in terms of the
constants $h_{i,j}$. These expressions can be obtained by picking six different values of $\mathbf{u}$
with $\mathbf{u}^*\mathbf{u} = 1$, setting $\mathbf{z} = [\,g_1(\mathbf{u}) \quad g_2(\mathbf{u}) \quad g_3(\mathbf{u})\,]^T$, and then solving the resulting system
of six linear equations in terms of $\{b_{1,1}, b_{2,2}, b_{3,3}, b_{1,2}, b_{1,3}, b_{2,3}\}$ obtained from (3.29).
Unfortunately these expressions are very cumbersome, and therefore in practice it is
easier to just solve the system of six linear equations resulting from the numerical data of
the particular problem.
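
The numerical fit of $\mathbf{B}$ described above can be sketched as follows. The center is taken from (3.30), and the entries of $\mathbf{B}$ are obtained from the linear equations implied by (3.29). Two departures from the text are assumptions made for robustness and brevity: the sketch uses 20 sample points with a least-squares solve rather than exactly six equations, and it uses a random $3 \times 4$ matrix `H3` as a stand-in for the first three rows of $\mathbf{H}$.

```python
import numpy as np

def w_of(r, theta):
    """The 4-vector of (3.20) for |u1| = r and angle difference theta."""
    r2 = np.sqrt(1.0 - r ** 2)
    return np.array([r ** 2, np.cos(theta) * r * r2, np.sin(theta) * r * r2, r2 ** 2])

def fit_ellipsoid(H3, samples):
    """Fit c and B of (3.29)-(3.31) from sampled points [g1, g2, g3]."""
    c = 0.5 * (H3[:, 0] + H3[:, 3])                        # center, equation (3.30)
    V = np.array([H3 @ w_of(r, th) - c for r, th in samples])
    # (z - c)^T B (z - c) = 1 is linear in (b11, b22, b33, b12, b13, b23).
    Q = np.column_stack([V[:, 0] ** 2, V[:, 1] ** 2, V[:, 2] ** 2,
                         2 * V[:, 0] * V[:, 1], 2 * V[:, 0] * V[:, 2],
                         2 * V[:, 1] * V[:, 2]])
    b, *_ = np.linalg.lstsq(Q, np.ones(len(V)), rcond=None)
    B = np.array([[b[0], b[3], b[4]],
                  [b[3], b[1], b[5]],
                  [b[4], b[5], b[2]]])
    return c, B

rng = np.random.default_rng(2)
H3 = rng.standard_normal((3, 4))                           # synthetic first three rows of H
samples = list(zip(rng.uniform(0.0, 1.0, 20), rng.uniform(0.0, 2 * np.pi, 20)))
c, B = fit_ellipsoid(H3, samples)

z = H3 @ w_of(0.3, 1.1) - c                                # a point not used in the fit
value = z @ B @ z                                          # close to 1 if it lies on E
```

A point generated from a parameter pair that was not used in the fit should satisfy (3.29) to within rounding error, which gives a quick consistency check on the fitted $\mathbf{B}$.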

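The following theorem combines Theorem 3.7 and the above result that
$\{g_1(\mathbf{u}), g_2(\mathbf{u}), g_3(\mathbf{u})\}$ with $\mathbf{u}^*\mathbf{u} = 1$ is an ellipsoid to give a useful characterization of
$\partial f(\mathbf{d})$.

Theorem 3.8. For $n > 3$ and $\mathbf{d}$ such that the maximum singular value
$f(\mathbf{d}) = \bar\sigma(\mathbf{D}\mathbf{M}\mathbf{D}^{-1})$ has multiplicity 2, the subdifferential $\partial f(\mathbf{d})$ is given by

$$\partial f(\mathbf{d}) = \{\mathbf{z} \in \mathbb{R}^n \mid \mathbf{P}\mathbf{z} = \mathbf{q},\ ([\,z_1 \quad z_2 \quad z_3\,] - \mathbf{c}^T)\,\mathbf{B}\,([\,z_1 \quad z_2 \quad z_3\,]^T - \mathbf{c}) \le 1\}$$

where constants $\mathbf{P}$ and $\mathbf{q}$ are given by (3.27) and $\mathbf{c}$ and $\mathbf{B}$ by (3.30) and (3.31).

Proof. The subdifferential $\partial f(\mathbf{d})$ is just the convex hull of the 3-dimensional
ellipsoid $E$ given by (3.29) translated to $\mathbb{R}^n$ by making it an element of the affine set
given by Theorem 3.7. The convex hull of $E$ is the union of itself and its interior, which
is given by $\operatorname{conv}E = \{\mathbf{z} \in \mathbb{R}^3 \mid (\mathbf{z} - \mathbf{c})^T\mathbf{B}(\mathbf{z} - \mathbf{c}) \le 1\}$. Q.E.D.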

To complete the ellipsoidal characterization of $\partial f(\mathbf{d})$ when the maximum
singular value has multiplicity 2, the case of $n = 3$ is now discussed. When $n = 3$, $g_3(\mathbf{u})$
is an affine function of $g_1(\mathbf{u})$ and $g_2(\mathbf{u})$, such that $\operatorname{aff}\partial f(\mathbf{d})$ becomes a 2-dimensional
plane in 3-dimensional space. The effect is that the 3-dimensional ellipsoid $E$ is
degenerate in that it has an axis of length zero, because it is required to be a subset of a
2-dimensional plane. Consequently, $\operatorname{conv}E = E$ such that $\partial f(\mathbf{d})$ becomes a 2-dimensional
ellipse including its interior in a 3-dimensional space. Also, $\partial f(\mathbf{d})$ has no interior in
$\mathbb{R}^3$ (i.e., there are no elements of $\partial f(\mathbf{d})$ that are not also on the boundary of $\partial f(\mathbf{d})$).
Finally, note that degenerate cases are possible. Consider the matrix
$\mathbf{M} = \operatorname{diag}([\,1 \quad 1 \quad 0 \quad 0\,]^T)$ such that the maximum singular value has multiplicity 2. The
above analysis gives $\mathbf{H} = 0$ such that equation (3.27) is not meaningful. For this case
$\partial f(\mathbf{d})$ is no longer characterized by a 3-dimensional affine set, but is actually
$\partial f(\mathbf{d}) = \{0\}$, which is a special ellipsoid whose axes all have length zero.

Theorem 3.8 and the preceding paragraph concerning the case of $n = 3$ give the
desired ellipsoidal characterization of $\partial f(\mathbf{d})$ when the maximum singular value is
repeated once. The next logical step is to extend the results of this section to the case
when the maximum singular value is repeated more than once. Unfortunately, the
preceding ellipsoidal characterization no longer holds and the only characterization of
$\partial f(\mathbf{d})$ is that given by Theorem 3.6. As is shown in the next section, this still has some
utility in determining a steepest descent direction.

3.4.  Determining  the  Steepest  Descent  Direction  and  Conditions  for  a  Minimum 

When the maximum singular value is distinct the gradient exists and the steepest
descent direction is given by $-\nabla f(\mathbf{d}) / \|\nabla f(\mathbf{d})\|$. Furthermore, the necessary and
sufficient condition for a minimum is $\nabla f(\mathbf{d}) = 0$. When the maximum singular value is
repeated, the results of the previous section together with Theorem 3.4 and Theorem 3.5 can be
used in a steepest descent optimization algorithm. First the case when the
maximum singular value is repeated once is considered, because the ellipsoidal characterization
of $\partial f(\mathbf{d})$ results in a convex optimization problem for determining the steepest descent
direction. This is followed by the more general case when the maximum singular value is
repeated more than once.

Using Theorem 3.5 and the ellipsoidal characterization of $\partial f(\mathbf{d})$ given by
Theorem 3.8, the subgradient that gives the steepest descent direction is now given by the
optimization

$$\boldsymbol{\xi}_{sd}(\mathbf{d}) = \arg\min_{\boldsymbol{\xi}} \|\boldsymbol{\xi}\| \qquad (3.32)$$

such that

$$\mathbf{P}\boldsymbol{\xi} = \mathbf{q} \qquad (3.33a)$$

and

$$([\,\xi_1 \quad \xi_2 \quad \xi_3\,] - \mathbf{c}^T)\,\mathbf{B}\,([\,\xi_1 \quad \xi_2 \quad \xi_3\,]^T - \mathbf{c}) \le 1 \qquad (3.33b)$$

Optimization (3.32) with constraints (3.33a) and (3.33b) represents the minimum distance
from the origin to the ellipsoid $\partial f(\mathbf{d})$. The objective function of optimization
(3.32) is convex in the $n$ parameters $\{\xi_1, \xi_2, \ldots, \xi_n\}$ and the constraints (3.33a) and
(3.33b) define convex sets. This $n$-dimensional optimization can be reduced to a
3-dimensional optimization by incorporating the equality constraints (3.33a) into the
objective function (3.32), because by Theorem 3.7, (3.33a) implies that $\{\xi_4, \xi_5, \ldots, \xi_n\}$
are affine functions of $\{\xi_1, \xi_2, \xi_3\}$. The optimization given by (3.32) and (3.33) becomes

$$\boldsymbol{\xi}_{sd}(\mathbf{d}) = \arg\min \|\boldsymbol{\xi}\|^2 = \arg\min_{\xi_1, \xi_2, \xi_3}\ \xi_1^2 + \xi_2^2 + \xi_3^2 + \sum_{i=1}^{n-3}\left(q_i - p_{i,1}\xi_1 - p_{i,2}\xi_2 - p_{i,3}\xi_3\right)^2 \qquad (3.34)$$

such that

$$([\,\xi_1 \quad \xi_2 \quad \xi_3\,] - \mathbf{c}^T)\,\mathbf{B}\,([\,\xi_1 \quad \xi_2 \quad \xi_3\,]^T - \mathbf{c}) \le 1 \qquad (3.35)$$

where the terms $\{\xi_4, \xi_5, \ldots, \xi_n\}$ of $\boldsymbol{\xi}_{sd}(\mathbf{d})$ are obtained from the affine functions of
$\{\xi_1, \xi_2, \xi_3\}$. The objective function of optimization (3.34) is a positive semi-definite
quadratic function and is therefore convex. In addition, the constraint (3.35) defines a convex
set. Therefore, determining the steepest descent direction when the maximum singular
value is repeated once reduces to a simple 3-dimensional convex quadratic optimization
over a convex set. Finally, from Theorem 3.8 the necessary and sufficient condition for a
minimum, i.e. $0 \in \partial f(\mathbf{d})$, reduces to

$$\mathbf{c}^T\mathbf{B}\mathbf{c} \le 1, \quad \mathbf{q} = 0 \qquad (3.36)$$

because $[\,z_1 \quad z_2 \quad z_3\,]^T = 0$ must be an element of the ellipsoid and, when
$[\,z_1 \quad z_2 \quad z_3\,]^T = 0$, the terms $\{z_4, z_5, \ldots, z_n\}$ are zero only when $\mathbf{q} = 0$ (i.e., the affine set
$S = \{\mathbf{z} \in \mathbb{R}^n \mid \mathbf{P}\mathbf{z} = \mathbf{q}\}$ must pass through the origin).
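
One simple way to solve the reduced problem (3.34)-(3.35) numerically is projected gradient descent after mapping the ellipsoid onto the unit ball. The sketch below is an illustration under stated assumptions, not the algorithm of this chapter: the choice of solver, the function name, the iteration count, and the toy data ($n = 4$, unit-ball ellipsoid centered at $[\,2 \quad 0 \quad 0\,]^T$, $\xi_4 = 0$) are all hypothetical. It assumes $\mathbf{B}$ is positive definite.

```python
import numpy as np

def min_norm_subgradient(P, q, c, B, iters=5000):
    """Approximate xi_sd of (3.34)-(3.35) by projected gradient descent."""
    P3 = P[:, :3]
    L = np.linalg.cholesky(np.linalg.inv(B))   # z = c + L s maps ||s|| <= 1 onto conv E
    G = np.vstack([L, -P3 @ L])                # full subgradient xi = G s + b
    b = np.concatenate([c, q - P3 @ c])
    step = 1.0 / (2.0 * np.linalg.norm(G, 2) ** 2)
    s = np.zeros(3)
    for _ in range(iters):
        s = s - step * 2.0 * G.T @ (G @ s + b)  # gradient step on ||G s + b||^2
        s = s / max(1.0, np.linalg.norm(s))     # project back onto the unit ball
    return G @ s + b

# Toy data: the closest point of the ball around (2, 0, 0) to the origin is (1, 0, 0).
P = np.array([[0.0, 0.0, 0.0, 1.0]])
q = np.array([0.0])
c = np.array([2.0, 0.0, 0.0])
B = np.eye(3)
xi_sd = min_norm_subgradient(P, q, c, B)
direction = -xi_sd / np.linalg.norm(xi_sd)      # steepest descent direction (Theorem 3.5)

# Same set moved so that it contains the origin: condition (3.36) holds and xi_sd = 0.
xi_opt = min_norm_subgradient(P, q, np.array([0.5, 0.0, 0.0]), B)
```

In the second call $\mathbf{c}^T\mathbf{B}\mathbf{c} = 0.25 \le 1$ and $\mathbf{q} = 0$, so the minimum-norm subgradient is the zero vector and the point is optimal.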

Now consider the case when the maximum singular value is repeated more than once.
From Theorem 3.5, the steepest descent direction is obtained from the smallest
subgradient in the Euclidean norm. From Theorem 3.6, all subgradients are given by the
convex hull of $\nabla f_u(\mathbf{d};\mathbf{u})$ for $\mathbf{u}^*\mathbf{u} = 1$, which means every subgradient can be expressed
as the convex combination $\lambda\nabla f_u(\mathbf{d};\mathbf{u}_1) + (1 - \lambda)\nabla f_u(\mathbf{d};\mathbf{u}_2)$ with $\mathbf{u}_1 \ne \mathbf{u}_2$, $\lambda \in [0,1]$,
$\mathbf{u}_1^*\mathbf{u}_1 = 1$ and $\mathbf{u}_2^*\mathbf{u}_2 = 1$. Therefore the optimization problem given by (3.12) to
determine the subgradient used to obtain the steepest descent direction can be written in
the form

$$\boldsymbol{\xi}_{sd}(\mathbf{d}) = \arg\min_{\lambda, \mathbf{u}_1, \mathbf{u}_2} \|\lambda\nabla f_u(\mathbf{d};\mathbf{u}_1) + (1 - \lambda)\nabla f_u(\mathbf{d};\mathbf{u}_2)\| \qquad (3.37a)$$

with the constraints

$$\mathbf{u}_1 \ne \mathbf{u}_2, \quad \lambda \in [0,1], \quad \mathbf{u}_1^*\mathbf{u}_1 = 1 \quad \text{and} \quad \mathbf{u}_2^*\mathbf{u}_2 = 1 \qquad (3.37b)$$

Unfortunately, the objective function of optimization (3.37) is non-convex in the
components of the complex vectors $\mathbf{u}_1$ and $\mathbf{u}_2$, and therefore has all of the associated
difficulties, such as local versus global minima. In addition, from Theorem 3.4 the
necessary and sufficient condition for a minimum is given by $\boldsymbol{\xi}_{sd}(\mathbf{d}) = 0$.

3.5.  Attainability  of  MPDA  when  the  maximum  singular  value  is  repeated 

When the maximum singular value is distinct, the necessary condition for an
infimum of (3.13) is $\nabla f(\mathbf{d}) = 0$, where the gradient is given by (3.16). This implies that
the moduli of the major input and major output principal directions are elementwise
equal. Furthermore, a unitary transformation matrix $\mathbf{U}$ can be determined that shifts the
angles of the elements of the input and output principal directions such that MPDA is
achieved and the upper bound for $\mu$ is non-conservative. In general MPDA is not
possible when the maximum singular value is repeated, and the upper bound on $\mu$
is conservative. Therefore, the goal of this section is to determine the sufficient
conditions for which MPDA is attainable when the maximum singular value is repeated.
These conditions are important, because they result in a non-conservative upper bound
for $\mu$.

A sufficient condition for attainability of MPDA is that there exists a major input
and major output principal direction pair with elementwise equal moduli. This is
equivalent to the existence of $\mathbf{u}$ such that $\nabla f_u(\mathbf{d};\mathbf{u}) = 0$. In contrast, the less stringent
sufficient condition for a minimum is $0 \in \partial f(\mathbf{d})$, whereas the condition $\nabla f_u(\mathbf{d};\mathbf{u}) = 0$ is
equivalent to 0 being an element of the surface of $\partial f(\mathbf{d})$. For the case when the
maximum singular value has multiplicity 2 this becomes the condition that 0 is on the
surface of the ellipsoid. In other words

$$\mathbf{c}^T\mathbf{B}\mathbf{c} = 1 \qquad (3.38a)$$

and

$$\mathbf{q} = 0 \qquad (3.38b)$$

Equations (3.38a) and (3.38b) represent the sufficient conditions for attainability of
MPDA when the maximum singular value has multiplicity 2. When the maximum
singular value is repeated more than once the sufficient condition for attainability of
MPDA becomes

$$\min_{\mathbf{u}} \|\nabla f_u(\mathbf{d};\mathbf{u})\| = 0 \qquad (3.39)$$

with $\mathbf{u}^*\mathbf{u} = 1$. Condition (3.39) is not as convenient as (3.38), but is still useful as a
method for determining attainability of MPDA and thus the conservatism of the upper
bound of $\mu$.
3.6. Reconciling the Results with the PDA Results

The principal direction alignment (PDA) principle (Daniel et al., 1986) states that the
infimum of (3.1) occurs at a stationary point of the largest singular value for which a
stationary point exists, starting with the maximum singular value. If the maximum
singular value is repeated then there is no stationary point (the maximum singular value is
non-differentiable), and an attempt is made to find a stationary point of the second largest
singular value, and so on. This statement is not entirely accurate. Consider the case
when, at the infimum, the maximum singular value is repeated and therefore the gradient does not
exist. As such, the gradient cannot be 0 and there is no stationary point, but it is possible
to have a repeated maximum singular value and still achieve MPDA, as demonstrated by
Example 3.3. As such, the infimum occurs at a non-stationary point, contradicting the
PDA theory.

The PDA theory can be rectified as follows. First, a more accurate statement than
stating that the infimum occurs at a stationary point (i.e., when all the partials are zero) of a
singular value is to state that the infimum occurs at a point where there exists a left and right
singular vector pair that has elementwise equal moduli. The work of the previous section
gives the conditions under which it is possible to equate the moduli when a singular
value is repeated. If the moduli can be equated, then MPDA is achieved; otherwise it is
necessary to use the PDA algorithm by infimizing the next singular value.

3.7.  Examples 

The  following  three  examples  demonstrate  the  results  of  the  previous  sections. 
The  first  example  shows  how  to  determine  the  steepest  descent  direction.  The  second 
example  demonstrates  the  conditions  for  a  minimum.  The  third  example  illustrates  the 
conditions for which MPDA is attainable.

3.7.1. Example 3.1.

Let $M = AB^*$, where

\[
A = \begin{bmatrix}
-0.1582 - 0.3074i & 0.3252 + 0.3078i \\
0.4198 - 0.5890i & 0.0843 - 0.0067i \\
0.2182 + 0.0182i & 0.7031 + 0.1455i \\
0.1039 - 0.4891i & -0.4090 + 0.0507i \\
-0.0765 + 0.2315i & -0.3256 - 0.0301i
\end{bmatrix}
\]

and

\[
B = \begin{bmatrix}
-0.3681 - 0.3181i & 0.2366 + 0.2711i \\
-0.2708 + 0.0371i & 0.0536 + 0.3304i \\
-0.4548 + 0.5280i & -0.0931 - 0.2255i \\
0.3127 - 0.1501i & -0.0013 - 0.0917i \\
0.2842 + 0.0444i & 0.8244 + 0.1044i
\end{bmatrix}
\]

In performing the infimization $\inf_D \bar\sigma(DMD^{-1})$, consider the point $d = [1\ 1\ 1\ 1\ 1]^T$
corresponding to $D = I$. The maximum singular value $\bar\sigma(DMD^{-1}) = \bar\sigma(M)$ is repeated
(i.e., $\sigma_1(M) = \sigma_2(M) = 1$, with $\sigma_3(M) = \sigma_4(M) = \sigma_5(M) = 0$). Therefore, the
objective function $f(d) = \bar\sigma(DMD^{-1})$ is non-differentiable at $d = [1\ 1\ 1\ 1\ 1]^T$, and
the results of this chapter are used to solve the optimization efficiently by either
determining a steepest descent direction from the point $d = [1\ 1\ 1\ 1\ 1]^T$ or by
determining whether the point satisfies the optimality and MPDA conditions.
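The repeated maximum singular value claimed for this example can be checked numerically. The following is a minimal sketch, assuming NumPy (not a tool named in the text); the entries are transcribed from the matrices A and B of Example 3.1, which are given to four decimals, so the comparison uses a loose tolerance.

```python
import numpy as np

# Entries of A and B as transcribed from Example 3.1 (5 x 2, complex).
A = np.array([
    [-0.1582 - 0.3074j,  0.3252 + 0.3078j],
    [ 0.4198 - 0.5890j,  0.0843 - 0.0067j],
    [ 0.2182 + 0.0182j,  0.7031 + 0.1455j],
    [ 0.1039 - 0.4891j, -0.4090 + 0.0507j],
    [-0.0765 + 0.2315j, -0.3256 - 0.0301j],
])
B = np.array([
    [-0.3681 - 0.3181j,  0.2366 + 0.2711j],
    [-0.2708 + 0.0371j,  0.0536 + 0.3304j],
    [-0.4548 + 0.5280j, -0.0931 - 0.2255j],
    [ 0.3127 - 0.1501j, -0.0013 - 0.0917j],
    [ 0.2842 + 0.0444j,  0.8244 + 0.1044j],
])

# M = A B^* is 5 x 5 with rank at most 2.
M = A @ B.conj().T

# Singular values in descending order; the first two should coincide.
sigma = np.linalg.svd(M, compute_uv=False)
print(np.round(sigma, 4))
```

Because the two largest singular values agree to within the rounding of the data, any gradient-based formula for the maximum singular value is unreliable at this point, which is why the subdifferential is used instead.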

First,  the  ellipsoidal  characterization  of  the  subdifferential  is  obtained  using  the 
method  of  Section  3.3.2.  An  orthonormal  set  of  right  singular  vectors  corresponding  to 
the  repeated  maximum  singular  value  is 


\[
\{x_1, x_2\} = \left\{
\begin{bmatrix}
0.1106 - 0.1679i \\
-0.1893 - 0.4442i \\
0.3526 - 0.6134i \\
-0.2971 + 0.1422i \\
-0.0002 + 0.3428i
\end{bmatrix},
\begin{bmatrix}
0.5229 + 0.0786i \\
0.0843 + 0.5385i \\
0.1979 - 0.1544i \\
0.0567 + 0.5552i \\
-0.2160 - 0.0469i
\end{bmatrix}
\right\}
\]


and  an  orthonormal  set  of  left  singular  vectors  corresponding  to  the  repeated  maximum 
singular  value  is 


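\[
\{y_1, y_2\} = \left\{
\begin{bmatrix}
0.0000 + 0.0000i \\
0.2151 + 0.1955i \\
-0.0322 + 0.3960i \\
-0.0690 - 0.2280i \\
0.3825 - 0.7447i
\end{bmatrix},
\begin{bmatrix}
0.6051 + 0.0000i \\
0.3141 - 0.0597i \\
-0.1384 - 0.6067i \\
-0.1529 + 0.2204i \\
0.1730 - 0.2060i
\end{bmatrix}
\right\}
\]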


Using these sets of left and right singular vectors and equations (3.21)-(3.24) gives

\[
F := \begin{bmatrix}
0.0404 & 0.0893 & -0.1930 & -0.0865 \\
0.1486 & -0.6221 & -0.0195 & 0.1949 \\
0.3427 & 0.8006 & 0.0148 & -0.3243 \\
0.0517 & 0.2036 & 0.2458 & 0.2395 \\
-0.5834 & -0.4713 & -0.0481 & -0.0235
\end{bmatrix}
\]

From Theorem 3.8 the ellipsoidal characterization of the subdifferential is given by

\[
\partial f(d) = \left\{ z \in \mathbb{R}^n \;\middle|\; Pz = q,\ \left([z_1\ z_2\ z_3] - c^T\right) B \left([z_1\ z_2\ z_3]^T - c\right) \le 1 \right\}
\]

where the elements of the matrix

\[
P = \begin{bmatrix}
1.2180 & 0.6214 & 0.0927 & 1.0000 & 0.0000 \\
-0.2180 & 0.3786 & 0.9073 & 0.0000 & 1.0000
\end{bmatrix}
\]

and the vector

\[
q = \begin{bmatrix} 0.2251 \\ -0.2251 \end{bmatrix}
\]

are obtained from (3.27), the matrix

\[
B = \begin{bmatrix}
106.6882 & -13.8175 & -21.7949 \\
-13.8175 & 32.1680 & 17.6270 \\
-21.7949 & 17.6270 & 15.3504
\end{bmatrix}
\]

is obtained by the method mentioned after equation (3.31), and the vector

\[
c = \begin{bmatrix} -0.023\ldots \\ 0.1718 \\ 0.0092 \end{bmatrix}
\]

is given by (3.30). The point $d = [1\ 1\ 1\ 1\ 1]^T$ is not optimal, because the
necessary optimality condition $q = 0$ of (3.36) is not satisfied. Consequently, MPDA
does not hold either. The next step is therefore to find a steepest descent direction that
decreases the objective function in the next step of an iterative optimization
algorithm. The subgradient that gives the steepest descent direction is obtained by
solving the simple 3-parameter optimization given by (3.34) and (3.35) and is determined
to be


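\[
s_{sd}(d) = \begin{bmatrix} 0.0264 \\ 0.1069 \\ -0.0795 \\ 0.1338 \\ -0.1877 \end{bmatrix}
\]

Finally, the steepest descent direction is

\[
g(d) = \begin{bmatrix} -0.0988 \\ -0.3996 \\ 0.2970 \\ -0.5002 \\ 0.7015 \end{bmatrix}
\]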
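The reported numbers can be cross-checked with a short sketch, assuming NumPy. Two observations are tested: the subgradient s_sd satisfies the affine constraint P z = q of the subdifferential, and the reported direction g appears to be the negative of s_sd scaled to unit Euclidean length. The second point is inferred from the printed values; it is not quoted from equations (3.34) and (3.35).

```python
import numpy as np

# Data transcribed from Example 3.1.
P = np.array([[ 1.2180, 0.6214, 0.0927, 1.0, 0.0],
              [-0.2180, 0.3786, 0.9073, 0.0, 1.0]])
q = np.array([0.2251, -0.2251])
s_sd = np.array([0.0264, 0.1069, -0.0795, 0.1338, -0.1877])
g_reported = np.array([-0.0988, -0.3996, 0.2970, -0.5002, 0.7015])

# The steepest-descent subgradient must lie in the subdifferential,
# so it has to satisfy the linear constraint P z = q.
residual = P @ s_sd - q

# Candidate descent direction: negative subgradient, unit 2-norm.
g = -s_sd / np.linalg.norm(s_sd)

print(np.round(residual, 4))
print(np.round(g, 4))
```

Both checks agree with the printed values to within the four-decimal rounding of the data.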

3.7.2. Example 3.2.

The following example is taken from Daniel et al. (1986). Let $M = AB^*$, where

\[
A = \begin{bmatrix}
0.65012 + 0.00000i & 0.00000 + 0.00000i \\
0.45970 + 0.00000i & 0.45970 + 0.00000i \\
0.45970 + 0.00000i & 0.00000 + 0.45970i \\
-0.39322 + 0.00000i & -0.53729 + 0.53729i
\end{bmatrix}
\]

and

\[
B = \begin{bmatrix}
0.00000 + 0.00000i & 0.65012 + 0.00000i \\
0.45970 + 0.00000i & -0.45970 + 0.00000i \\
0.45970 + 0.00000i & 0.00000 - 0.45970i \\
0.53729 - 0.53729i & 0.39332 + 0.00000i
\end{bmatrix}
\]


Again, in performing the infimization $\inf_D \bar\sigma(DMD^{-1})$, the point $d = [1\ 1\ 1\ 1]^T$
corresponding to $D = I$ has a maximum singular value $\bar\sigma(DMD^{-1}) = \bar\sigma(M)$ that is
repeated (i.e., $\sigma_1(M) = \sigma_2(M) = 1$, with $\sigma_3(M) = \sigma_4(M) = 0$). Therefore, the
objective function $f(d) = \bar\sigma(DMD^{-1})$ is non-differentiable at $d = [1\ 1\ 1\ 1]^T$, and the
results of this chapter are used to solve the optimization by either determining a
steepest descent direction from the point $d = [1\ 1\ 1\ 1]^T$ or by determining whether the point
satisfies the optimality and MPDA conditions.

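The ellipsoidal characterization of the subdifferential is given by

\[
\partial f(d) = \left\{ z \in \mathbb{R}^n \;\middle|\; Pz = q,\ \left([z_1\ z_2\ z_3] - c^T\right) B \left([z_1\ z_2\ z_3]^T - c\right) \le 1 \right\}
\]

where

\[
P = [1\ \ 1\ \ 1\ \ 1], \qquad q = 0,
\]

\[
B = \begin{bmatrix}
3.5575 & 1.0750 & 0.0000 \\
1.0750 & 6.0773 & 1.0750 \\
0.0000 & 1.0750 & 3.5576
\end{bmatrix}
\]

and

\[
c = \begin{bmatrix} 0.0548 \\ -0.0749 \\ 0.0548 \end{bmatrix}
\]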

The point $d = [1\ 1\ 1\ 1]^T$ is optimal, because the necessary optimality
conditions $q = 0$ and $c^TBc = 0.0378 < 1$ of (3.36) are satisfied, implying $0 \in \partial f(d)$. This
means the upper bound $\inf_D \bar\sigma(DMD^{-1})$ is 1.0000. On the other hand, the MPDA
attainability condition $c^TBc = 1$ is not satisfied. Therefore, MPDA is not attainable and
the upper bound is conservative, i.e., $\mu(M) < \inf_D \bar\sigma(DMD^{-1}) = 1$, and either the principal
direction alignment (PDA) method proposed in Daniel et al. (1986) or a direct attempt at
solving the lower bound $\sup_U \rho(MU)$ must be used to obtain an exact value of the
structured singular value.

3.7.3. Example 3.3.

This last example shows that even though the maximum singular value is repeated
at the optimum, it may still be possible to attain MPDA and thus eliminate the
conservatism in the upper bound of $\mu$. Consider the matrix

\[
M = \begin{bmatrix}
-0.0274 + 0.2253i & -0.0622 + 0.0571i & -0.0597 + 0.0705i & -0.0147 + 0.0149i & 0.1624 - 0.1333i \\
0.2201 - 0.2277i & 0.2355 - 0.0394i & 0.1303 + 0.0643i & -0.0632 + 0.1792i & -0.3688 - 0.1437i \\
-0.4758 + 0.2550i & -0.1977 - 0.1981i & -0.1025 + 0.0008i & 0.1533 + 0.1583i & 0.2666 + 0.1838i \\
0.1192 - 0.0574i & -0.2418 - 0.0274i & 0.1239 - 0.2037i & 0.3778 - 0.3278i & -0.0824 + 0.3762i \\
-0.0974 - 0.3482i & 0.1610 + 0.1308i & -0.1589 - 0.0976i & -0.4272 - 0.1706i & 0.1610 + 0.0723i
\end{bmatrix}
\]

The point $d = [1\ 1\ 1\ 1\ 1]^T$ corresponding to $D = I$ has a maximum singular value
$\bar\sigma(DMD^{-1}) = \bar\sigma(M)$ that is repeated (i.e., $\sigma_1(M) = \sigma_2(M) = 1$, with
$\sigma_3(M) = \sigma_4(M) = \sigma_5(M) = 0$). Therefore, the objective function $f(d) = \bar\sigma(DMD^{-1})$ is
non-differentiable at $d = [1\ 1\ 1\ 1\ 1]^T$, and the results of this chapter are used to
solve the optimization efficiently by either determining a steepest descent direction from
the point $d = [1\ 1\ 1\ 1\ 1]^T$ or by determining whether the point satisfies the optimality and
MPDA conditions.

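The ellipsoidal characterization of the subdifferential is given by

\[
\partial f(d) = \left\{ z \in \mathbb{R}^n \;\middle|\; Pz = q,\ \left([z_1\ z_2\ z_3] - c^T\right) B \left([z_1\ z_2\ z_3]^T - c\right) \le 1 \right\}
\]

where

\[
P = \begin{bmatrix}
0.0828 & -0.9106 & 0.4221 & 1.0000 & 0.0000 \\
0.9172 & 1.9106 & 0.5779 & 0.0000 & 1.0000
\end{bmatrix},
\qquad
q = \begin{bmatrix} 0.0000 \\ 0.0000 \end{bmatrix},
\]

\[
B = \begin{bmatrix}
37.4471 & -7.5773 & 25.1289 \\
-7.5773 & 28.1213 & -6.3079 \\
25.1289 & -6.3079 & 26.8994
\end{bmatrix}
\]

and

\[
c = \begin{bmatrix} -0.2398 \\ 0.0632 \\ 0.2010 \end{bmatrix}
\]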

The point $d = [1\ 1\ 1\ 1\ 1]^T$ is optimal, because it satisfies the necessary
optimality conditions (3.36). Furthermore, the MPDA attainability conditions (3.38) are
also satisfied, namely $q = 0$ and $c^TBc = 1$. Therefore, the upper bound
$\inf_D \bar\sigma(DMD^{-1}) = 1.0000$ is tight and the structured singular value is exactly
$\mu(M) = 1.0000$, even though the maximum singular value is repeated and the
objective function is nondifferentiable.
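The contrast between Examples 3.2 and 3.3 comes down to where the origin sits relative to the ellipsoid: strictly inside (optimal, MPDA not attainable) or on the boundary (optimal, MPDA attainable). The sketch below, assuming NumPy, evaluates the quadratic form $c^TBc$ for both examples. The helper `classify` and its tolerance are hypothetical and only encode the conditions as they are quoted in the two examples.

```python
import numpy as np

def classify(B, c, q, tol=1e-2):
    """Evaluate c^T B c and the two conditions quoted in Examples 3.2 and 3.3.

    optimal: q = 0 and c^T B c <= 1  (zero lies in the subdifferential)
    mpda:    q = 0 and c^T B c  = 1  (zero lies on its boundary)
    The tolerance absorbs the four-decimal rounding of the printed data.
    """
    level = float(c @ B @ c)
    q_zero = bool(np.allclose(q, 0.0, atol=tol))
    optimal = q_zero and level <= 1.0 + tol
    mpda = q_zero and abs(level - 1.0) <= tol
    return level, optimal, mpda

# Example 3.2
B32 = np.array([[3.5575, 1.0750, 0.0000],
                [1.0750, 6.0773, 1.0750],
                [0.0000, 1.0750, 3.5576]])
c32 = np.array([0.0548, -0.0749, 0.0548])
res32 = classify(B32, c32, q=np.array([0.0]))

# Example 3.3
B33 = np.array([[37.4471, -7.5773, 25.1289],
                [-7.5773, 28.1213, -6.3079],
                [25.1289, -6.3079, 26.8994]])
c33 = np.array([-0.2398, 0.0632, 0.2010])
res33 = classify(B33, c33, q=np.array([0.0, 0.0]))

print(res32)
print(res33)
```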

3.8.  Conclusions 

The MPDA principle approach to solving the structured singular value problem is
investigated. In the infimization that gives an upper bound to $\mu$, a repeated maximum
singular value makes the objective function non-differentiable. Therefore,
efficient gradient descent optimization algorithms that use the analytical expression for
the gradient must be modified. The first result of this chapter is a characterization of the
subdifferential, which represents the set of all subgradients or generalized gradients. In
addition, for the case of a once-repeated maximum singular value it is shown that the
subdifferential is in fact a 3-dimensional ellipsoid in an n-dimensional space. Using
results from nondifferentiable optimization theory, the steepest descent direction is obtained
from this characterization of the subdifferential to facilitate the optimization.
Furthermore, conditions for optimality are presented that are based on zero being an
element of the subdifferential. Finally, attainability of MPDA at the optimum is shown to
be equivalent to zero being on the boundary of the subdifferential, enhancing the PDA
results when the maximum singular value is repeated.


CHAPTER 4
SPECTRAL RADIUS - MAXIMUM SINGULAR VALUE EQUIVALENCE UNDER
OPTIMAL SIMILARITY SCALING

4.1.  Introduction 

It is well known that the maximum singular value of a matrix is an upper bound of
the spectral radius (i.e., $\rho(M) \le \bar\sigma(M)$ where $M \in \mathbb{C}^{n \times n}$). Determining the conditions
under  which  the  upper  bound  is  attained  is  a  significant  issue  in  the  field  of  robust 
control.  One  approach  is  to  seek  properties  of  matrices  that  are  necessary  and  sufficient 
for  equality  of  the  spectral  radius  and  the  maximum  singular  value.  Another  approach 
uses  optimization  to  condition  the  matrix  through  similarity  and  unitary  transformations 
in  order  to  increase  the  spectral  radius  and  decrease  the  maximum  singular  value  upper 
bound  so  that  equality  is  achieved. 

Previous work deals with the optimal conditioning of matrices from a numerical
accuracy standpoint (Bauer, 1963) and focuses on similarity transformations using
non-negative diagonal matrices. The scaling problem for non-negative matrices yields a
very elegant and precise result. It provides a closed-form expression for the optimal
similarity scaling matrix for which the Perron root (largest positive eigenvalue of a
positive matrix) equals the least upper bound subordinate to an absolute norm. In
addition, there are analytical expressions for the elements of the optimal diagonal matrix
that involve the Perron eigenvectors of the given positive matrix (Stoer and Witzgall,
1962). The relationship to the present work is that the least upper bound of the matrix
subordinate to the Euclidean norm is the maximum singular value of a matrix. The
previous results are based on earlier work that derives a necessary condition for the least
upper bound of a matrix to equal the modulus of an eigenvalue of the matrix, namely, that
the corresponding right and left eigenvectors are dual (Bauer, 1962). Unfortunately, for
the general case of complex matrices, there are no equivalent analytical results on optimal
scaling by positive diagonal matrices, although there exist several numerical algorithms.
From a robust control perspective, the structured singular value $\mu$, defined as

\[
\mu(M) := \sup_{U \in \mathcal{U}} \rho(MU), \qquad
\mathcal{U} := \left\{ \operatorname{diag}(e^{j\theta_1}, e^{j\theta_2}, \ldots, e^{j\theta_n}) \;\middle|\; 0 \le \theta_i < 2\pi,\ i = 1,2,\ldots,n \right\}
\]

(Doyle, 1982), is a widely accepted tool in the robust analysis of linear systems. It considers the
problem of robust stability for a known plant subject to a block-diagonal uncertainty
structure under feedback. In general, any block-diagram interconnection of systems and
uncertainties can be rearranged into the block-diagonal standard form. Calculating $\mu$ is
not trivial; in fact the problem has been proven to be NP-hard (Braatz et al., 1994). The
difficulty is that the spectral radius is non-convex over the set of unitary matrix
transformations. One approach is to consider upper bounds for the spectral radius that
can be calculated easily, and ideally should be attainable to eliminate conservatism. The
maximum singular value is a reasonable choice for an upper bound because it is invariant
under unitary matrix transformations. In addition, the maximum singular value upper
bound can be decreased by optimizing over similarity transformations because the
spectral radius is invariant under such transformations. Ultimately, the problem becomes
one of conditioning a matrix through optimal similarity and unitary transformations to
achieve equality between the spectral radius and the maximum singular value.

In addressing the existence of solutions to the proposed optimization,
Kouvaritakis and Latchman (1985) introduce the major principal direction alignment (MPDA)
property. The result states that the spectral radius of a matrix is equal to the
maximum singular value of the matrix if and only if a major input principal direction and
a major output principal direction of the matrix are aligned. MPDA is a strict condition
for a matrix, but it can be used to determine the optimal positive diagonal matrix and
unitary matrix that result in equality between the aforementioned definition of $\mu$ and
the maximum singular value upper bound for the case when the maximum singular value
is distinct.

It  is  the  goal  of  this  work  to  establish  relationships  between  results  obtained  from 
different  perspectives  of  the  same  spectral-radius/maximum-singular-value  equivalence 
problem.  To  this  end,  the  earlier  work  by  Bauer  (1963)  on  positive  matrices  is  extended 
to the class of general complex matrices. The results are necessary conditions for
equality that are used to improve the calculation of $\mu$ through its upper bound.

4.2.  Mathematical  Background 
4.2.1.  Dual  Norms  and  Dual  Vectors 

In the theoretical development that follows, the mathematical concepts of dual
norms and dual vectors are utilized. These concepts are explained in a paper by Bauer
(1962) and are reviewed here to facilitate the theoretical development. Given a vector
norm $\|\cdot\|$, its dual vector norm $\|\cdot\|_D$ is defined as

\[
\|y\|_D := \max_{\|x\| = 1} \operatorname{Re}\, y^*x = \max_{x \ne 0} \frac{\operatorname{Re}\, y^*x}{\|x\|}
\]

For such dual norms the Hölder inequality

\[
\|y\|_D \|x\| \ge \operatorname{Re}\, y^*x
\]

holds and is sharp, i.e., for any $y_0$ there exists at least one $x_0$, and for any $x_0$ there exists
at least one $y_0$, such that the equality holds (Bauer, 1962). If such a pair $(x_0, y_0)$ with
$\|y_0\|_D \|x_0\| = \operatorname{Re}\, y_0^*x_0$ also satisfies the scaling condition

\[
\|y_0\|_D \|x_0\| = 1
\]

it is called a dual pair. Note that the dual vector of $x$ is often written $(x)^D$. A pair
$(x_0, y_0)$ is strictly dual, written $y_0 \parallel_D x_0$, if $\|y_0\|_D \|x_0\| = y_0^*x_0 = 1$. For strictly
homogeneous norms (i.e., those satisfying $\|\alpha x\| = |\alpha| \|x\|$ for all complex scalars $\alpha$) the
Hölder inequality may be sharpened to (Bauer, 1962)

\[
\|y\|_D \|x\| \ge |y^*x|
\]

For a dual pair $(x_0, y_0)$ under a homogeneous norm it follows that
$\operatorname{Re}\, y_0^*x_0 = \|y_0\|_D \|x_0\| \ge |y_0^*x_0|$, which implies that $\operatorname{Re}\, y_0^*x_0 = y_0^*x_0$. Hence, for a strictly
homogeneous norm every pair of dual vectors $(x_0, y_0)$ is also a strictly dual pair. In
addition, there exists a strict dual $y_0$ for any $x_0 \ne 0$ and a strict dual $x_0$ for any $y_0 \ne 0$.

In general, the dual norm of a p-norm, $\|x\|_p := \left(\sum_i |x_i|^p\right)^{1/p}$, is the associated
q-norm $\|\cdot\|_q$, where $1/p + 1/q = 1$. So the infinity-norm and the 1-norm are duals, and the
dual norm of the 2 (Euclidean) norm is itself. For the 2-norm, a pair $(x_0, y_0)$ is dual if

\[
y_0 = \frac{x_0}{\|x_0\|_2^2}
\]
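The sharpness of the Hölder inequality can be illustrated numerically for p-norms. The sketch below, assuming NumPy, builds a strictly dual vector for a given complex vector under a p-norm with 1 < p < infinity. The function name `dual_vector` and the closed form inside it are assumptions of this sketch (a standard construction), not formulas quoted from Bauer (1962).

```python
import numpy as np

def dual_vector(x, p):
    """Return y with ||y||_q * ||x||_p = y^* x = 1, where 1/p + 1/q = 1.

    Construction used here: y_i = x_i |x_i|^(p-2) / ||x||_p^p,
    with zero entries of x mapped to zero entries of y.
    """
    x = np.asarray(x, dtype=complex)
    mag = np.abs(x)
    norm_p = np.linalg.norm(x, p)
    y = np.zeros_like(x)
    nz = mag > 0
    y[nz] = x[nz] * mag[nz] ** (p - 2) / norm_p ** p
    return y

x = np.array([1 + 2j, -3j, 0.5])
p = 3.0
q = p / (p - 1.0)                      # conjugate exponent, here 1.5

y = dual_vector(x, p)
inner = np.vdot(y, x)                  # y^* x (vdot conjugates its first argument)
product = np.linalg.norm(y, q) * np.linalg.norm(x, p)

# For p = 2 the construction reduces to y = x / ||x||_2^2.
y2 = dual_vector(x, 2.0)
```

Equality in the Hölder inequality together with the scaling condition is exactly what makes the pair dual; the p = 2 case reproduces the Euclidean formula stated in the text.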

4.2.2.  Positive  Matrix  Result 

Early work on determining when the spectral radius equals the maximum singular
value is concerned with positive matrices transformed by non-negative diagonal matrices,
because they have good numerical properties (i.e., smaller round-off errors) and therefore
may be used for conditioning of matrices. In addition, positive matrices remain positive
under transformation by non-negative diagonal matrices, leading to connections with
Perron roots $\pi(P)$ (positive eigenvalues of largest modulus) of positive matrices
$P \in \mathbb{R}_+^{n \times n}$ (note, $\mathbb{R}_+$ is the set of positive real numbers). From this perspective, Stoer and
Witzgall (1962) show that for the positive matrix $P$ and non-negative diagonal
matrices $D$

\[
\pi(P) = \min_{D \in \mathbb{D}} \operatorname{lub}(D^{-1}PD) \qquad (4.1)
\]

where $\mathbb{D} := \{\operatorname{diag}(d_1, d_2, \ldots, d_n) \mid d_i > 0,\ i = 1,2,\ldots,n\}$, and

\[
\operatorname{lub}(A) := \max_{x \ne 0} \frac{\|Ax\|}{\|x\|} = \max_{\|x\| = 1} \|Ax\|
\]

is the least upper bound norm of a matrix $A \in \mathbb{C}^{n \times n}$ subordinate to the vector norm $\|\cdot\|$. It
is noted that the least upper bound norm is equivalent to the induced matrix norm, and
that when the subordinating norm is the Euclidean norm then $\operatorname{lub}(A) = \bar\sigma(A)$.

In developing the result it is necessary to make use of a result from Bauer (1962)
that states that if $\lambda$ is an eigenvalue of $A$, then

\[
|\lambda| = \operatorname{lub}(A) \qquad (4.2)
\]

is only possible if a right and left $\lambda$-eigenvector are dual with respect to the norm to
which the bound norm is subordinate, where by definition a left $\lambda$-eigenvector $w$ of $A$
satisfies the relation $w^*A = \lambda w^*$ (Golub & Van Loan, 1983; Isaacson & Keller, 1966;
Stewart, 1970). The reader is cautioned that some authors use the term left eigenvector
for an eigenvector of $A^T$.

Let $D_0$ be the minimizing $D$ of (4.1); then

\[
\pi(P) = \pi(D_0^{-1}PD_0) = \operatorname{lub}(D_0^{-1}PD_0) \qquad (4.3)
\]

From the result of Bauer (1962), (4.3) is only possible if the right and left eigenvectors of
$D_0^{-1}PD_0$ are dual. Now, if $v > 0$ and $w > 0$ are the right and left Perron vectors of $P$
(note that the greater-than operator ">" is understood element-wise when
applied to a vector), then it is straightforward to show that $D^{-1}v > 0$ and $Dw > 0$ are the
right and left Perron vectors of $D^{-1}PD$. Therefore, any $D_0$ that minimizes (4.1) such that
equality is achieved must also make the vectors $D_0^{-1}v$ and $D_0w$ dual, where $v > 0$ and
$w > 0$ are the right and left Perron vectors of $P$.

The problem now reduces to transforming the positive vectors $v > 0$ and $w > 0$
to dual vectors $D_0^{-1}v$ and $D_0w$, where $D_0$ is a non-negative diagonal matrix. Stoer and
Witzgall (1962) state that for absolute norms (i.e., norms that depend only on the moduli
of their components (Bauer et al., 1961)) there exists one, and up to positive multiples
only one, non-singular non-negative diagonal matrix $D_0$ such that $D_0^{-1}v$ and $D_0w$ form a
dual pair. For p-norms, which are necessarily absolute norms, the positive vectors $y > 0$
and $x > 0$ are dual if

\[
(y_i)^q = (x_i)^p, \qquad i = 1,2,\ldots,n
\]

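and

\[
\frac{1}{p} + \frac{1}{q} = 1
\]

Therefore, the matrix

\[
D_0 = \operatorname{diag}\!\left( \frac{v_1^{p/(p+q)}}{w_1^{q/(p+q)}},\ \frac{v_2^{p/(p+q)}}{w_2^{q/(p+q)}},\ \ldots,\ \frac{v_n^{p/(p+q)}}{w_n^{q/(p+q)}} \right) \qquad (4.4)
\]

makes $D_0^{-1}v$ and $D_0w$ a dual pair for any right and left Perron eigenvectors $v > 0$ and
$w > 0$.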

Duality is only a necessary condition for (4.2). Therefore, to show that (4.1) holds it
suffices to show that (4.3) holds for those matrices $D_0^{-1}PD_0$ whose right and left Perron
vectors $D_0^{-1}v$ and $D_0w$ are dual, where $D_0$ is given by (4.4). Using the definitions of
eigenvalues and eigenvectors it can be shown that

\[
\operatorname{Re}\{(w^*D_0)(D_0^{-1}PD_0)(D_0^{-1}v)\} = \pi(D_0^{-1}PD_0)\, \operatorname{Re}\{(w^*D_0)(D_0^{-1}v)\} \qquad (4.5)
\]

and from the definition of duality of vectors it is true that

\[
\frac{\operatorname{Re}\{(w^*D_0)(D_0^{-1}v)\}}{\|D_0w\|_D \, \|D_0^{-1}v\|} = 1 \qquad (4.6)
\]

Combining (4.5) and (4.6) gives

\[
\frac{\operatorname{Re}\{(w^*D_0)(D_0^{-1}PD_0)(D_0^{-1}v)\}}{\|D_0w\|_D \, \|D_0^{-1}v\|} = \pi(D_0^{-1}PD_0) \qquad (4.7)
\]

Using the bilinear characterization of the least upper bound

\[
\operatorname{lub}(A) := \max_{x \ne 0,\ y \ne 0} \frac{\operatorname{Re}\{y^*Ax\}}{\|y\|_D \, \|x\|} \qquad (4.8)
\]

Stoer and Witzgall (1962) show there is a maximizing pair for (4.8) in the positive
orthant, and that the only maximizing pair in the positive orthant for $\operatorname{lub}(D_0^{-1}PD_0)$ is the
pair $D_0^{-1}v$ and $D_0w$, such that

\[
\operatorname{lub}(D_0^{-1}PD_0) = \frac{\operatorname{Re}\{(w^*D_0)(D_0^{-1}PD_0)(D_0^{-1}v)\}}{\|D_0w\|_D \, \|D_0^{-1}v\|}
\]

which from (4.7) equals $\pi(D_0^{-1}PD_0)$. Therefore, (4.1) holds where the minimizing $D$ is
given by (4.4).

The relationship of Stoer and Witzgall's positive matrix result to the spectral-
radius/maximum-singular-value problem can be shown by specifying the least upper
bound norm to be subordinate to the Euclidean norm, i.e.,

\[
\operatorname{lub}(A) = \bar\sigma(A) \qquad (4.9)
\]

where $A \in \mathbb{C}^{n \times n}$. Combining (4.9) and the fact that the Perron root of a positive matrix is
the spectral radius, (4.1) becomes

\[
\rho(P) = \min_{D \in \mathbb{D}} \bar\sigma(D^{-1}PD) \qquad (4.10)
\]

for positive matrices $P$ and positive diagonal matrices $D$. In addition, from (4.4), there
is an analytical expression for the optimizing $D_0$ given by

\[
D_0 = \operatorname{diag}\!\left( \frac{v_1^{1/2}}{w_1^{1/2}},\ \frac{v_2^{1/2}}{w_2^{1/2}},\ \ldots,\ \frac{v_n^{1/2}}{w_n^{1/2}} \right) \qquad (4.11)
\]

where $v > 0$ and $w > 0$ are right and left Perron vectors of $P$. Clearly, (4.10) shows that
for positive matrices there is a simple similarity transformation for which the spectral
radius attains its maximum singular value upper bound.

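The closed-form scaling (4.11) is easy to try on a small positive matrix. The following is a minimal sketch, assuming NumPy; the random test matrix and the helper names `perron_vector` and `optimal_scaling` are hypothetical and serve only to illustrate the Euclidean-norm case of the result.

```python
import numpy as np

def perron_vector(P):
    """Perron root and a positive Perron eigenvector of an entrywise positive P."""
    vals, vecs = np.linalg.eig(P)
    k = np.argmax(vals.real)           # the Perron root has the largest real part
    return vals[k].real, np.abs(vecs[:, k])

def optimal_scaling(P):
    """Diagonal D0 of (4.11): d_i = sqrt(v_i / w_i)."""
    rho, v = perron_vector(P)          # right Perron vector: P v = rho v
    _, w = perron_vector(P.T)          # left Perron vector:  w^T P = rho w^T
    return rho, np.diag(np.sqrt(v / w))

rng = np.random.default_rng(0)
P = rng.uniform(0.1, 1.0, size=(4, 4)) # arbitrary positive test matrix

rho, D0 = optimal_scaling(P)
scaled = np.linalg.solve(D0, P @ D0)   # D0^{-1} P D0

sigma_scaled = np.linalg.svd(scaled, compute_uv=False)[0]
sigma_unscaled = np.linalg.svd(P, compute_uv=False)[0]
print(rho, sigma_scaled, sigma_unscaled)
```

After scaling, the right and left Perron vectors of the scaled matrix are parallel, so the maximum singular value drops to the spectral radius, while the unscaled maximum singular value is in general larger.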
4.2.3.  Major  Principal  Direction  Alignment  Property 

In solving various robust control problems it is necessary to determine the
conditions under which the spectral radius of a matrix attains its maximum singular value
upper bound. The major principal direction alignment (MPDA) property addresses this
problem (Kouvaritakis and Latchman, 1985). Consider the singular value decomposition
of a square matrix $A \in \mathbb{C}^{n \times n}$ given by

\[
A = X(A)\,\Sigma(A)\,Y^*(A)
\]

where $\Sigma(A)$ is the diagonal matrix of singular values placed in descending order, and
$X(A)$ and $Y(A)$ are unitary matrices whose columns are the respective output and
input principal directions of $A$, arranged in an order conformal with the order of the
singular values (Lancaster and Tismenetsky, 1985). Now, define a major input principal
direction $y(A)$ and a major output principal direction $x(A)$ of a matrix $A$, respectively,
as normalized input and output principal directions corresponding to the maximum
singular value $\bar\sigma(A)$ of $A$. The MPDA property is given in the following theorem.

Theorem 4.1. The spectral radius of any matrix $A \in \mathbb{C}^{n \times n}$ is equal to the
maximum singular value of $A$ if and only if there exists a major input principal
direction and a major output principal direction of $A$ which are aligned such
that

\[
x(A) = e^{j\theta}\, y(A)
\]

Proof. Given by Kouvaritakis and Latchman (1985). An alternative proof based
on dual norms and dual vectors is given in Chapter 2. Q.E.D.
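Theorem 4.1 can be illustrated on two small matrices. The sketch below, assuming NumPy, measures alignment as $|x^*y|$ for the leading singular-vector pair, which equals 1 exactly when $x = e^{j\theta}y$. The function name `mpda_gap` is hypothetical, and the check is only meaningful when the maximum singular value is distinct; a repeated maximum singular value would require searching over the whole major subspace.

```python
import numpy as np

def mpda_gap(A):
    """Return (rho, sigma_max, alignment) for a square matrix A.

    alignment = |x^* y| with x, y the major output and input principal
    directions; a value of 1 means x = exp(j*theta) * y for some theta.
    """
    X, s, Yh = np.linalg.svd(A)        # A = X diag(s) Y^*
    x = X[:, 0]                        # major output direction
    y = Yh[0].conj()                   # major input direction (first column of Y)
    rho = np.max(np.abs(np.linalg.eigvals(A)))
    return rho, s[0], abs(np.vdot(x, y))

# Aligned case: a rotated diagonal matrix, rho = sigma_max = 2.
aligned = mpda_gap(np.exp(0.7j) * np.diag([2.0, 1.0]))

# Non-aligned case: a nilpotent matrix, rho = 0 but sigma_max = 2.
skewed = mpda_gap(np.array([[0.0, 2.0], [0.0, 0.0]]))

print(aligned)
print(skewed)
```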

4.2.4.  MPDA  as  a  Control  Theory  Application 

One area in the field of robust control that makes use of the spectral-
radius/maximum-singular-value equivalence problem is the stability analysis of
multivariable feedback systems in the presence of structured uncertainties. Of particular
interest is the stability of diagonally perturbed systems for which the uncertainty is
represented by the complex diagonal matrix

\[
\Delta \in \operatorname{diag}(\delta_1, \delta_2, \ldots, \delta_n), \qquad |\delta_i| \le p_i, \quad p_i \in \mathbb{R}_+, \quad i = 1,2,\ldots,n
\]

This class of systems is especially amenable to spectral-radius-preserving similarity
scaling, and through simple transformations is representative of the more general class of
full structured uncertainties.

Using Nyquist arguments in the complex plane, it can be shown that

\[
\sup_{\Delta} \rho(M\Delta) < 1 \qquad (4.12)
\]

is a necessary and sufficient stability condition, where the complex matrix $M$ is a function
of the system's transfer function matrix evaluated at a particular frequency. The
optimization problem (4.12) is non-convex, but it can be simplified by introducing the
following positive diagonal similarity scaling

\[
\rho(M\Delta) = \rho(D^{-1}MD\Delta) \le \bar\sigma(D^{-1}MD\Delta)
\]

Furthermore, using geometric arguments based on the MPDA principle, it can be shown
that the supremizing diagonal matrix $\Delta_{opt}$ has the form

\[
\Delta_{opt} = QU
\]

where $Q = \operatorname{diag}(q_1, q_2, \ldots, q_n)$ with $q_i \in \mathbb{R}_+$ and

\[
U \in \mathcal{U} := \left\{ \operatorname{diag}(e^{j\theta_1}, e^{j\theta_2}, \ldots, e^{j\theta_n}) \;\middle|\; 0 \le \theta_i < 2\pi,\ i = 1,2,\ldots,n \right\}
\]

The optimization problem (4.12) becomes equivalent to

\[
\sup_{\Delta} \rho(M\Delta) = \sup_{U \in \mathcal{U}} \rho(MQU) \le \inf_{D \in \mathbb{D}} \bar\sigma(D^{-1}MQD) \qquad (4.13)
\]

and the necessary and sufficient stability condition becomes

\[
\inf_{D \in \mathbb{D}} \bar\sigma(D^{-1}MQD) < 1 \qquad (4.14)
\]

Furthermore, using MPDA arguments it can be shown that the optimizing $D_0$ in (4.14)
results in the equality

\[
\sup_{U \in \mathcal{U}} \rho(MQU) = \bar\sigma(D_0^{-1}MQD_0)
\]

when the maximum singular value is distinct at the infimum.
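The inequality in (4.13) holds for every choice of U and D, not only at the optimum, which makes it easy to probe by random sampling. The sketch below, assuming NumPy, takes Q = I for simplicity and uses an arbitrary random complex M; both choices are assumptions of the illustration and do not come from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
# Arbitrary complex test matrix (Q = I is assumed, so M Q = M).
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

def spectral_radius(A):
    return np.max(np.abs(np.linalg.eigvals(A)))

def sigma_max(A):
    return np.linalg.svd(A, compute_uv=False)[0]

# Largest observed value of rho(M U) - sigma_max(D^{-1} M D) over random
# unitary diagonal U and positive diagonal D; it should never be positive.
worst_violation = -np.inf
for _ in range(200):
    U = np.diag(np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, n)))
    D = np.diag(rng.uniform(0.1, 10.0, n))
    lower = spectral_radius(M @ U)
    upper = sigma_max(np.linalg.solve(D, M @ D))
    worst_violation = max(worst_violation, lower - upper)

print(worst_violation)
```

Random sampling only confirms that the bound is never violated; closing the gap requires the supremization over U and the infimization over D described in the text.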

4.3.  Main  Result  -  Extension  of  the  Positive  Matrix  Result  to 
General  Complex  Matrices 

The  positive  matrix  result  of  Stoer  and  Witzgall  as  stated  by  (4.1)  and  specialized 
to  the  Euclidean  norm  by  (4.10)  gives  a  positive  diagonal  similarity  scaling  (4.11)  that 
results  in  equality  of  the  spectral  radius  and  maximum  singular  value  of  a  positive  matrix. 
When  applied  to  robust  control  problems  that  involve  complex  matrices,  the  positive 
matrix  result  is  usually  only  sub-optimal.  Therefore,  it  is  necessary  to  extend  the  result  to 
the  class  of  complex  matrices.  Unfortunately,  much  of  the  theoretical  development  is 
dependent  on  the  characteristic  properties  of  positive  matrices.  Therefore,  when 
generalizing  the  result  to  complex  matrices  it  is  not  possible  to  explicitly  state  that  there 
exists  a  similarity  scaling  that  will  result  in  equality  of  the  spectral  radius  and  maximum 
singular value of a matrix. Nevertheless, it is possible to determine the necessary conditions
for  the  existence  of  a  positive  diagonal  similarity  scaling  that  leads  to  equality.  The  result 
is  given  in  the  following  theorem. 

Theorem 4.2. Let $A \in \mathbb{C}^{n \times n}$ have a right eigenvector $v$ and a left eigenvector $w$
associated with an eigenvalue $\lambda(A)$ of maximum modulus such that
$|\lambda(A)| = \rho(A)$, and let $D_0 = \operatorname{diag}(d_{0,1}, d_{0,2}, \ldots, d_{0,n})$ be defined as

\[
D_0 := \arg\min_{D \in \mathbb{D}} \bar\sigma(D^{-1}AD)
\]

where $\mathbb{D} := \{\operatorname{diag}(d_1, d_2, \ldots, d_n) \mid d_i > 0,\ i = 1,2,\ldots,n\}$. Then if

\[
\rho(A) = \min_{D \in \mathbb{D}} \bar\sigma(D^{-1}AD) \qquad (4.15)
\]

the following three conditions hold

i)
\[
[d_{0,1}^2 \ \ d_{0,2}^2 \ \ \cdots \ \ d_{0,n}^2]^T \in \operatorname{null}(N) = \ker(N) \qquad (4.16)
\]
where
\[
N = \begin{bmatrix}
|v_1||w_1|^2 - |w_1| & |v_1||w_2|^2 & \cdots & |v_1||w_n|^2 \\
|v_2||w_1|^2 & |v_2||w_2|^2 - |w_2| & \cdots & |v_2||w_n|^2 \\
\vdots & \vdots & \ddots & \vdots \\
|v_n||w_1|^2 & |v_n||w_2|^2 & \cdots & |v_n||w_n|^2 - |w_n|
\end{bmatrix} \qquad (4.17)
\]

ii)
\[
\arg(v_i) = \arg(w_i), \qquad i = 1,2,\ldots,n \qquad (4.18)
\]

and either

iii-a)
\[
\sum_{i=1}^{n} |v_i||w_i| = 1 \qquad (4.19a)
\]

or

iii-b)
\[
|w_i| = 0 \quad \text{for at least one } i \in \{1,2,\ldots,n\} \qquad (4.19b)
\]

Proof. Assume (4.15) holds, where $D_0$ is an optimizing $D$ such that

\[
|\lambda(A)| = \bar\sigma(D_0^{-1}AD_0) \qquad (4.20)
\]

where $\lambda(A)$ is an eigenvalue of maximum modulus. Following the development of the
positive matrix result, a necessary condition for (4.20) to hold is that the corresponding
right and left eigenvectors of $D_0^{-1}AD_0$ be dual with respect to the Euclidean norm. Given
that $v$ and $w$ are a pair of right and left eigenvectors of the $\lambda(A)$ eigenvalue of $A$,
then $D_0^{-1}v$ and $D_0w$ are corresponding right and left eigenvectors of $D_0^{-1}AD_0$.
Therefore, the necessary condition is that $D_0^{-1}v$ and $D_0w$ are dual with respect to the
Euclidean norm. In the mathematical background section it is stated that this is
equivalent to requiring

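\[
D_0^{-1}v = \frac{D_0w}{\|D_0w\|_2^2} \qquad (4.21)
\]

Using the notation

\[
v = \left[\, |v_1|e^{j\arg(v_1)},\ |v_2|e^{j\arg(v_2)},\ \ldots,\ |v_n|e^{j\arg(v_n)} \,\right]^T
\]

the necessary condition (4.21) is equivalent to the set of scalar equalities

\[
|v_1|e^{j\arg(v_1)} = \frac{d_{0,1}^2\,|w_1|\,e^{j\arg(w_1)}}{d_{0,1}^2|w_1|^2 + d_{0,2}^2|w_2|^2 + \cdots + d_{0,n}^2|w_n|^2} \qquad (4.22\text{-}1)
\]

\[
|v_2|e^{j\arg(v_2)} = \frac{d_{0,2}^2\,|w_2|\,e^{j\arg(w_2)}}{d_{0,1}^2|w_1|^2 + d_{0,2}^2|w_2|^2 + \cdots + d_{0,n}^2|w_n|^2} \qquad (4.22\text{-}2)
\]

\[
\vdots
\]

\[
|v_n|e^{j\arg(v_n)} = \frac{d_{0,n}^2\,|w_n|\,e^{j\arg(w_n)}}{d_{0,1}^2|w_1|^2 + d_{0,2}^2|w_2|^2 + \cdots + d_{0,n}^2|w_n|^2} \qquad (4.22\text{-}n)
\]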


which  leads  directly  to  necessary  condition  (4.18).    Given  that  (4.18)  is  satisfied,  (4.22) 
can  be  rearranged  as 
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v,    w,     -  \WA  V,    w2 


vi  m 


\V2\m  V2     ™2      ~W2 


lV„||W.  N|W2 


V.   w.    —  \w„ 


\v2\\w. 


2 

0,1 

2 

0,2 


O.h 


=      0  (4.23) 


from  which  necessary  condition  (4.16)  becomes  apparent.  For  the  null  space  of  the 
matrix  given  in  (4.23)  to  be  non-trivial,  its  determinant  must  be  0  (i.e.  the  matrix  must  be 
rank  deficient).  First,  note  that  if  any  |w,|  =  0  then  the  corresponding  column  /  is 
composed  of  only  zeros  making  the  matrix  rank  deficient,  resulting  in  part  iii-b)  of 
necessary  condition      (4.19).      For  the  case  when  no 


1^1  =  0    for    i  =  \,2,---,n    the 


determinant  can  be  determined  using  elementary  row  and  column  operations  to  obtain  a 
matrix  that  is  sparse  and  has  the  same  determinant.  Multiplying  column  1  by  -1^]  /|w,| 
and  adding  the  result  to  each  column  i  for  i  =  2,3, ■  •  -,n  gives  the  matrix 


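$$\begin{bmatrix}
|v_1||w_1|^2 - |w_1| & \dfrac{|w_2|^2}{|w_1|} & \dfrac{|w_3|^2}{|w_1|} & \cdots & \dfrac{|w_n|^2}{|w_1|} \\
|v_2||w_1|^2 & -|w_2| & 0 & \cdots & 0 \\
|v_3||w_1|^2 & 0 & -|w_3| & \cdots & 0 \\
\vdots & \vdots & & \ddots & \vdots \\
|v_n||w_1|^2 & 0 & 0 & \cdots & -|w_n|
\end{bmatrix}$$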


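Now, for $i = 2, 3, \ldots, n$, multiplying row $i$ by $|w_i|/|w_1|$ and adding the result to row 1
gives the matrix

$$\begin{bmatrix}
\left(|v_1||w_1| - 1 + |v_2||w_2| + |v_3||w_3| + \cdots + |v_n||w_n|\right)|w_1| & 0 & 0 & \cdots & 0 \\
|v_2||w_1|^2 & -|w_2| & 0 & \cdots & 0 \\
|v_3||w_1|^2 & 0 & -|w_3| & \cdots & 0 \\
\vdots & \vdots & & \ddots & \vdots \\
|v_n||w_1|^2 & 0 & 0 & \cdots & -|w_n|
\end{bmatrix}$$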


for which the determinant is

$$\pm\left(|v_1||w_1| + |v_2||w_2| + \cdots + |v_n||w_n| - 1\right)|w_1||w_2|\cdots|w_n|$$

So, for the case when no $|w_i| = 0$, the determinant is zero and the null space is
non-trivial only when part iii-a) of necessary condition (4.19),

$$\sum_{i=1}^{n} |v_i||w_i| = 1 \qquad (4.19)$$

is satisfied. Q.E.D.
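
As an independent check on the elimination steps, the same determinant can be obtained from the matrix determinant lemma. This is only a sketch of an alternative route, not the argument used in the proof. The symbols $M$, $a$, $b$ and $C$ are introduced here purely for illustration, with $M$ denoting the coefficient matrix of (4.23), and all $|w_i|$ are assumed to be nonzero.

```latex
\begin{aligned}
M &= a\,b^{T} - C, \qquad a_i = |v_i|, \quad b_k = |w_k|^2, \quad
C = \operatorname{diag}\bigl(|w_1|, \ldots, |w_n|\bigr), \\
\det M &= \det(-C)\,\bigl(1 - b^{T} C^{-1} a\bigr)
        = (-1)^{n} \Bigl(\prod_{i=1}^{n} |w_i|\Bigr)
          \Bigl(1 - \sum_{i=1}^{n} |v_i||w_i|\Bigr).
\end{aligned}
```

The factor $(-1)^n$ accounts for the sign ambiguity in the determinant expression, and under the stated assumption the determinant vanishes exactly when $\sum_i |v_i||w_i| = 1$.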

4.4.  Example  4.1 

The  following  example  demonstrates  the  result  of  Theorem  4.2.    Consider  the 
matrix 


$$A = \begin{bmatrix}
-0.5259 + 0.6358j & 0.3090 - 1.3791j & 0.2031 + 0.2317j & -0.1016 + 0.9524j \\
0.4712 + 0.1832j & 0.7383 - 0.5966j & -0.3174 - 0.1128j & -0.2840 - 0.2121j \\
-0.0290 - 0.1034j & -0.7906 + 2.0522j & 0.4991 - 0.6463j & -0.0584 + 0.1540j \\
0.0925 - 0.2759j & 0.5359 + 0.7832j & 0.2490 + 0.0831j & 0.0694 - 0.1919j
\end{bmatrix}$$


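where the eigenvalue of maximum modulus is $\lambda(A) = 1.6507 - 1.1293j$ such that
$|\lambda(A)| = \rho(A) = 2$. Performing the minimization on the right-hand side of (4.15) gives

$$\min_{D \in \mathcal{D}} \bar{\sigma}(D^{-1} A D) = 2$$

with

$$D_0 := \arg\min_{D \in \mathcal{D}} \bar{\sigma}(D^{-1} A D) =
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 0.5 & 0 & 0 \\
0 & 0 & 1.2 & 0 \\
0 & 0 & 0 & 0.7
\end{bmatrix}$$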

such that equation (4.15) holds. Therefore, the three necessary conditions of Theorem
4.2 must be satisfied. First, the right and left eigenvectors of $A$ associated with the
eigenvalue $\lambda(A) = 1.6507 - 1.1293j$ of maximum modulus are, respectively,

$$v = \begin{bmatrix} -0.0845 - 0.2857j \\ 0.3690 - 0.2780j \\ -0.1441 + 0.7943j \\ 0.0167 + 0.2140j \end{bmatrix},
\qquad
w = \begin{bmatrix} -0.0567 - 0.1918j \\ 0.9912 - 0.7467j \\ -0.0672 + 0.3704j \\ 0.0229 + 0.2932j \end{bmatrix}$$

For condition i) the matrix $N$ given by (4.17) is

$$N = \begin{bmatrix}
-0.1881 & 0.4589 & 0.0422 & 0.0258 \\
0.0185 & -0.5294 & 0.0655 & 0.0400 \\
0.0323 & 1.2432 & -0.2620 & 0.0698 \\
0.0086 & 0.3305 & 0.0304 & -0.2756
\end{bmatrix}$$


and

$$N \begin{bmatrix} d_{0,1}^2 \\ d_{0,2}^2 \\ d_{0,3}^2 \\ d_{0,4}^2 \end{bmatrix}
= N \begin{bmatrix} 1 \\ 0.25 \\ 1.44 \\ 0.49 \end{bmatrix} = 0$$

such that condition i) is satisfied. Finally, it is easy to show that conditions ii) and iii-a)
of Theorem 4.2 are satisfied.
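
The numbers in Example 4.1 can be checked with a few lines of code. The sketch below is not part of the original example: the function names `build_N` and `check_example` are hypothetical, the data are the four-decimal values printed above, and the matrix entries are assumed to be $N_{ik} = |v_i||w_k|^2 - \delta_{ik}|w_i|$, as in (4.23). Because the data are rounded, the residuals are expected to be small rather than exactly zero.

```python
import cmath

# Data transcribed from Example 4.1 (rounded to four decimals in the text).
v = [-0.0845 - 0.2857j, 0.3690 - 0.2780j, -0.1441 + 0.7943j, 0.0167 + 0.2140j]
w = [-0.0567 - 0.1918j, 0.9912 - 0.7467j, -0.0672 + 0.3704j, 0.0229 + 0.2932j]
d0 = [1.0, 0.5, 1.2, 0.7]      # diagonal entries of D_0
lam = 1.6507 - 1.1293j         # eigenvalue of maximum modulus


def build_N(v, w):
    """Matrix with entries |v_i||w_k|^2 - delta_ik |w_i| (assumed form of (4.23))."""
    n = len(v)
    return [[abs(v[i]) * abs(w[k]) ** 2 - (abs(w[i]) if i == k else 0.0)
             for k in range(n)] for i in range(n)]


def check_example(v, w, d0):
    """Return (null-space residual, largest phase mismatch, alignment sum)."""
    n = len(v)
    N = build_N(v, w)
    d_sq = [d * d for d in d0]
    # Largest entry of N * [d_{0,i}^2]; it should be close to zero.
    residual = max(abs(sum(N[i][k] * d_sq[k] for k in range(n))) for i in range(n))
    # arg(v_i) - arg(w_i), measured through the phase of v_i * conj(w_i).
    phase_gap = max(abs(cmath.phase(vi * wi.conjugate())) for vi, wi in zip(v, w))
    # Sum of |v_i||w_i|; it should be close to one.
    alignment = sum(abs(vi) * abs(wi) for vi, wi in zip(v, w))
    return residual, phase_gap, alignment


residual, phase_gap, alignment = check_example(v, w, d0)
print(residual, phase_gap, alignment, abs(lam))
```

With the rounded data, all three quantities agree with their expected values to within the rounding of the printed entries.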

4.5.  Conclusions 
In this paper we recover the dual-norm arguments for the case of complex A and
obtain an exact, closed-form expression for the optimal D matrix. This result has
independent value in terms of the mathematical completeness of the extension to the case
of complex matrices, as well as potential algorithmic improvements in computing the
optimal scaling matrices.


CHAPTER  5 

GENERALIZATION  OF  THE  NYQUIST  ROBUST  STABILITY  MARGIN  AND  ITS 

APPLICATION  TO  SYSTEMS  WITH  REAL  AFFINE  PARAMETRIC 

UNCERTAINTIES 

5.1.  Introduction 

The critical-direction theory developed by Latchman and Crisalle (1995) and
Latchman et al. (1997) addresses the problem of robust stability of systems affected by
uncertainties that can be characterized in terms of frequency-domain value sets. The
approach introduces the Nyquist robust stability margin $k_N(\omega)$ as a scalar measure of
robustness analogous to the structured singular value $\mu$ (Doyle, 1982) and the
multivariable stability margin $k_m$ (Safonov, 1982) within the value-set paradigm. This
chapter extends the critical direction theory to the more general case where the critical
value set may be nonconvex. The key to extending the theory is the introduction of a
generalized definition of the critical perturbation radius in a fashion that preserves all
previous results. The nonconvexity of the critical value set is observed in a number of
interesting problems, including the case studied by Fu (1990) consisting of rational
systems where the uncertainty appears affinely in the form of real parameters that belong
to a known rectangular polytope. The generalized critical direction theory is applied to
this particular class of uncertain systems, and is used to calculate the required Nyquist
robust stability margin with high precision and in the context of a computationally
manageable framework.

The robust stability problem studied by Fu is part of an extensive literature on
systems where the uncertainty appears in the form of parameters that vary in prescribed
real intervals, a situation of relevance to many engineering problems. Early advances in
this field are due to Kharitonov (1978, 1979) who derived necessary and sufficient
conditions for the robust stability of interval polynomials, that is, polynomials with
independent coefficients that take values in closed real intervals. An extension of
Kharitonov's theorem to rational interval plants is proposed in Chapellat et al. (1989),
where  the  objective  is  to  assess  the  stability  of  a  family  of  plants  by  testing  a  subset  of 
extreme  plants  or  extreme  segments.  The  number  of  extreme  plants  required  to  determine 
robust  stability  depends  on  the  functional  relationship  between  the  uncertain  parameters 
and  their  bounding  interval-sets.  Comprehensive  results  based  on  extreme  plants  or 
segments  are  known  to  exist  only  for  a  restricted  set  of  uncertainty  structures.  A  detailed 
account of Kharitonov-like methods can be found in Barmish (1994) and in the references
therein.  For  contextual  value,  it  is  worth  mentioning  that  many  of  the  methods  proposed 
are  based  on  determining  the  stability  of  a  set  of  Kharitonov  plants  (or  extreme  plants) 
derived  from  the  interval  bounding-set  description.  For  example,  Chapellat  et  al.  (1989) 
and  Bartlett  et  al.  (1990)  give  conditions  that  use  32  Kharitonov  segments  or  edges. 
Barmish  et  al.  (1992)  prove  that  when  using  first-order  compensators  it  is  necessary  and 
sufficient  that  sixteen  of  the  extreme  plants  be  stable;  furthermore,  under  certain 
conditions  only  eight  or  twelve  plants  are  necessary. 

In this chapter the generalized critical direction theory is applied to systems with
affine parametric uncertainty, exploiting earlier results of Fu (1990) regarding the
mapping of the uncertain parameters from their polytopic domain to the Nyquist plane to
develop a computationally tractable algorithm for calculating the Nyquist robust stability
margin. The chapter is organized as follows. Section 5.2 generalizes the critical direction
theory for systems with nonconvex critical value sets. Sections 5.3 through 5.8 are
concerned with the application of the generalized theory to the case of affine uncertain
rational systems with real polytopic parametric uncertainties. Section 5.3 introduces a
precise definition of the uncertain system considered, and Section 5.4 derives two
robust-stability theorems for these types of systems. Section 5.5 presents a systematic method
for calculating the critical perturbation radius, and Section 5.6 provides two examples of
the analysis method, including the case of a convex and the case of a nonconvex critical
value set. Overall conclusions are given in Section 5.7.

5.2.  Generalization  of  the  Critical  Direction  Theory 
5.2.1.  Preliminaries 

Consider the single-input single-output linear time-invariant system

$$g(s) = g_0(s) + \delta(s) \qquad (5.1)$$

where $g_0(s)$ is a known nominal transfer function, and $\delta(s) \in \Delta$ is an unknown
perturbation belonging to a known perturbation family $\Delta$. The focus of this analysis is
on the robust stability of the closed-loop system that results when the uncertain system
(5.1) is configured in the unity negative feedback control structure shown in Figure 5.1.

Figure 5.1. Unity feedback control scheme for an uncertain plant $g(s)$.

The following standard assumptions are made throughout this chapter:

(A1) The nominal transfer function $g_0(s)$ is stable under unity negative
feedback.

(A2) The set of allowable perturbations $\Delta$ is such that $g(s)$ and $g_0(s)$ have the
same number of open-loop unstable poles.
The robust stability analysis is based on a frequency-domain description of the uncertain
perturbations using value sets. The uncertainty value set of $g(s)$ at frequency $\omega$ is
defined as

$$\mathcal{V}(\omega) := \{\, g(j\omega) \mid g(j\omega) = g_0(j\omega) + \delta(j\omega),\ \delta(s) \in \Delta \,\}$$

and $\mathcal{V}(\omega)$ is said to lie on the Nyquist plane. A generic uncertainty value set is shown in
Figure 5.2.

Figure 5.2. Schematic of an uncertainty value set $\mathcal{V}(\omega_1)$ (shaded area),
and the critical perturbation radius $\rho_c(\omega_1)$ at a frequency $\omega_1$. Also
shown in the figure are the critical line $r(\omega_1)$ (dashed line), and the
nonconvex critical uncertainty value set $\mathcal{V}_c(\omega_1)$, which in this case is the
union of two disjoint straight-line segments (shown by the dotted lines).

The critical-direction theory advanced in Latchman and Crisalle (1995) and in
Latchman et al. (1997) is based on the observation that the smallest destabilizing
perturbations occur along the critical direction

$$d_c(j\omega) := -\frac{1 + g_0(j\omega)}{|1 + g_0(j\omega)|}$$

which is interpreted as the unit vector with origin at the nominal point $g_0(j\omega)$ and
pointing towards the critical point $-1 + j0$ (cf. Figure 5.2). This direction in turn defines
the critical line $r(\omega) := g_0(j\omega) + \alpha\, d_c(j\omega)$, $\alpha \in \mathbb{R}^+$, where $\mathbb{R}^+$ denotes the nonnegative
real numbers. The critical line $r(\omega)$ is interpreted as a ray that originates at the nominal
point $g_0(j\omega)$ and passes through the critical point $-1 + j0$. The intersection of the
uncertainty value set with the critical line determines the critical uncertainty value set
$\mathcal{V}_c(\omega) := \mathcal{V}(\omega) \cap r(\omega)$, which may be (i) a single straight-line segment or a single isolated
point (in which case $\mathcal{V}_c(\omega)$ is a convex set) or (ii) a union of disjoint straight-line
segments and isolated points (in which case $\mathcal{V}_c(\omega)$ is a nonconvex set). Figure 5.2 shows
the case of a nonconvex critical uncertainty value set. Finally, the boundary of the
uncertainty value set is denoted $\partial\mathcal{V}(\omega)$, and the set of critical boundary intersections
$\mathcal{B}_c(\omega)$ is defined as

$$\mathcal{B}_c(\omega) := \{\partial\mathcal{V}(\omega) \cap r(\omega)\} \setminus g_0(j\omega)$$

where "$\setminus$" is the set-difference operator. For the special case where $\partial\mathcal{V}(\omega) \cap r(\omega)$
contains $g_0(j\omega)$ as its only element, the following definition is applied:

$$\mathcal{B}_c(\omega) := \{g_0(j\omega)\}$$

Note that to determine $\mathcal{B}_c(\omega)$ it is necessary to have knowledge of the uncertainty value
set boundary only along the critical line. Clearly, $\mathcal{B}_c(\omega)$ contains a single element if
$\mathcal{V}_c(\omega)$ is a convex set, and contains at least two elements if $\mathcal{V}_c(\omega)$ is nonconvex.

When the critical value set $\mathcal{V}_c(\omega)$ is convex (as in the case of value sets that are
star-shaped with respect to the nominal point, for example), the critical perturbation radius is
defined as (Latchman and Crisalle, 1995; Latchman et al., 1997)

$$\rho_c(\omega) := \max\{\, \alpha \mid z = g_0(j\omega) + \alpha\, d_c(j\omega) \in \mathcal{V}(\omega) \,\} \qquad (5.2)$$

Definition (5.2) states that the critical perturbation radius for the case of a convex set
$\mathcal{V}_c(\omega)$ is simply the distance along the critical direction between the nominal point
$g_0(j\omega)$ and the uncertainty value set boundary $\partial\mathcal{V}(\omega)$. Note also that the perturbation
radius captures the "size" of the uncertainty that is relevant for stability analysis.
Definition (5.2) is not suitable, however, for the case of nonconvex critical value sets
$\mathcal{V}_c(\omega)$. In this chapter the following generalization of the definition of the critical
perturbation radius is proposed, which is applicable to both the convex and nonconvex
cases:

$$\rho_c(\omega) := \begin{cases}
|1 + g_0(j\omega)| - \xi(\omega) & \text{if } -1 + j0 \notin \mathcal{V}(\omega) \\
|1 + g_0(j\omega)| + \xi(\omega) & \text{otherwise}
\end{cases} \qquad (5.3)$$

where

$$\xi(\omega) = \min_{z \in \mathcal{B}_c(\omega)} |1 + z| \qquad (5.4)$$

represents the distance from $-1 + j0$ to the point in $\mathcal{B}_c(\omega)$ that is closest to the critical
point $-1 + j0$. The upper statement in definition (5.3) states that when $-1 + j0$ is not an
element of $\mathcal{V}(\omega)$, the critical perturbation radius $\rho_c(\omega)$ is defined as the difference
between two distances, namely, the distance from the critical point $-1 + j0$ to the
nominal point $g_0(j\omega)$ (represented by $|1 + g_0(j\omega)|$) and the distance from the critical
point $-1 + j0$ to the closest critical-boundary intersection (represented by $\xi(\omega)$). On the
other hand, when $-1 + j0$ is an element of $\mathcal{V}(\omega)$, the lower statement in (5.3) states that
the critical perturbation radius is taken as the sum of the two distances in question.
Observe that when the critical uncertainty value set is convex, $\mathcal{B}_c(\omega)$ has only one
element (i.e., there is only one critical boundary intersection), and definition (5.3)
becomes equivalent to definition (5.2). Note also that to compute the critical perturbation
radius from (5.3) it is necessary to have full knowledge of the set of critical boundary
intersections $\mathcal{B}_c(\omega)$ and to be able to evaluate whether the set membership condition
$-1 + j0 \in \mathcal{V}(\omega)$ holds; both of these issues are completely resolved in Section 5.3 and
Section 5.4 of this chapter for the case of systems with real affine parametric
uncertainties. For either definition it can be shown that $\rho_c(\omega) \ge 0$ for all frequencies.
Finally, the Nyquist robust stability margin

$$k_N(\omega) := \frac{\rho_c(\omega)}{|1 + g_0(j\omega)|} \qquad (5.5)$$

is defined as the ratio of the critical perturbation radius to the distance between the
nominal point $g_0(j\omega)$ and the critical point $-1 + j0$ measured along the critical
direction. Note that $k_N(\omega) \ge 0$ for all frequencies.
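
The definitions of the critical direction, the generalized radius (5.3)-(5.4) and the margin (5.5) translate almost directly into code at a single frequency. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the critical boundary intersections are assumed to be already known and supplied as complex numbers, and the membership of $-1 + j0$ in the value set is passed in as a flag rather than computed. The numerical data are invented for illustration only.

```python
def critical_direction(g0):
    """Unit vector pointing from the nominal point g0 toward -1 + 0j."""
    return -(1 + g0) / abs(1 + g0)


def critical_radius(g0, boundary_hits, critical_point_in_value_set):
    """Generalized critical perturbation radius, following definition (5.3).

    boundary_hits holds the critical boundary intersections; when g0 is the
    only boundary point on the critical line, pass [g0].
    """
    xi = min(abs(1 + z) for z in boundary_hits)   # distance (5.4)
    dist_nominal = abs(1 + g0)
    if critical_point_in_value_set:
        return dist_nominal + xi
    return dist_nominal - xi


def nyquist_margin(g0, boundary_hits, critical_point_in_value_set):
    """Ratio (5.5) of the critical radius to the distance |1 + g0|."""
    rho = critical_radius(g0, boundary_hits, critical_point_in_value_set)
    return rho / abs(1 + g0)


# Assumed data: nominal point at 0.5 and three boundary intersections on the
# critical line, none of them beyond the critical point -1.
g0 = 0.5 + 0.0j
d_c = critical_direction(g0)
hits = [g0 + alpha * d_c for alpha in (0.6, 0.9, 1.2)]
k_N = nyquist_margin(g0, hits, critical_point_in_value_set=False)
print(d_c, k_N)
```

In this illustration the intersection closest to $-1 + j0$ lies at a distance of 0.3 from it, so the radius is $1.5 - 0.3 = 1.2$ and the margin is $1.2/1.5 = 0.8$.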

5.2.2.  Analysis  of  Robust  Stability 

The analysis of the robust stability of the uncertain feedback system being
considered can be resolved in terms of the following theorem.

Theorem 5.1. Consider the uncertain system $g(s)$ given in (5.1) with
assumptions (A1) and (A2). Then, the closed-loop system is robustly stable under
unity feedback if and only if

$$-1 + j0 \notin \mathcal{V}(\omega) \quad \forall \omega \qquad (5.7)$$

Theorem 5.1 is simply a restatement of the well-known zero-exclusion principle
(Barmish, 1994), and it gives a necessary and sufficient condition for the robust stability
of the closed loop in question. However, Theorem 5.1 does not provide a measure of the
degree of robust stability of the loop, a quantity that would be most useful as the basis for
the synthesis of optimally robust controllers or for the assessment of the relative merits of
alternative control schemes. The critical direction theory seeks to quantify the robust
stability of such systems in terms of the Nyquist robust stability margin (5.5), which plays
a role analogous to that of the structured singular value (Doyle, 1982) and of the
multivariable stability margin (Safonov, 1982). Efficiency in the analysis is obtained
through the realization that it suffices to verify condition (5.7) only for value-set points
that lie along the critical direction; more precisely, the set membership condition (5.7)
holds if and only if $-1 + j0 \notin \mathcal{V}_c(\omega)$ holds. These observations lead to the following key
result of the critical direction theory.

Theorem 5.2. Consider the uncertain system $g(s)$ given in (5.1) with
assumptions (A1) and (A2). Then the closed-loop system is robustly stable under
unity feedback if and only if

$$k_N(\omega) < 1 \quad \forall \omega \qquad (5.8)$$

Proof. A complete proof is given in Latchman and Crisalle (1995) for the case
where $\mathcal{V}_c(\omega)$ is convex. For the nonconvex case, in which the generalized definition
(5.3) of $\rho_c(\omega)$ is utilized, the proof is extended as follows. From Theorem 5.1 the
uncertain closed-loop system is stable if and only if $-1 + j0 \notin \mathcal{V}(\omega)\ \forall \omega$. Therefore, to
prove that (5.8) is sufficient for robust stability, we must show that if $k_N(\omega) < 1\ \forall \omega$
then $-1 + j0 \notin \mathcal{V}(\omega)\ \forall \omega$. To prove by contradiction, assume that $k_N(\omega) < 1\ \forall \omega$ and
that $\exists\, \omega$ such that $-1 + j0 \in \mathcal{V}(\omega)$. Then applying definitions (5.3) and (5.5) for a
frequency at which $-1 + j0 \in \mathcal{V}(\omega)$ gives

$$k_N(\omega) = \frac{\rho_c(\omega)}{|1 + g_0(j\omega)|} = \frac{|1 + g_0(j\omega)| + \xi(\omega)}{|1 + g_0(j\omega)|} = 1 + \frac{\xi(\omega)}{|1 + g_0(j\omega)|}$$

where $\xi(\omega)$ is the nonnegative real scalar given by (5.4). Hence, $k_N(\omega) \ge 1$ for at least
one frequency, which contradicts the assumption. Therefore, if $k_N(\omega) < 1\ \forall \omega$ then it
follows that $-1 + j0 \notin \mathcal{V}(\omega)\ \forall \omega$. To prove that (5.8) is necessary for robust stability,
one must show that if $-1 + j0 \notin \mathcal{V}(\omega)\ \forall \omega$ then $k_N(\omega) < 1\ \forall \omega$. To establish this, note
that if $-1 + j0 \notin \mathcal{V}(\omega)\ \forall \omega$ then by definitions (5.3) and (5.5)

$$k_N(\omega) = \frac{\rho_c(\omega)}{|1 + g_0(j\omega)|} = \frac{|1 + g_0(j\omega)| - \xi(\omega)}{|1 + g_0(j\omega)|} = 1 - \frac{\xi(\omega)}{|1 + g_0(j\omega)|} \quad \forall \omega$$

where $\xi(\omega)$ is given by (5.4). In this case, however, since $-1 + j0 \notin \mathcal{V}(\omega)$ it follows that
$-1 + j0 \notin \mathcal{B}_c(\omega)$, and thus $\xi(\omega)$ must necessarily be a positive number. Using this fact in
the above equality leads to the conclusion that $k_N(\omega) < 1\ \forall \omega$. Q.E.D.

From Theorem 5.2 it follows that the scalar $k_N(\omega)$ serves to quantify the robust
stability of the closed-loop system. The computation of $k_N(\omega)$ requires knowledge of the
critical perturbation radius $\rho_c(\omega)$ defined in (5.3). The challenging task in a given
problem is in fact the calculation of the critical perturbation radius.

When $\mathcal{V}_c(\omega)$ is convex, definition (5.2) indicates that $\rho_c(\omega)$ represents the
distance between the point $g_0(j\omega)$ and the (unique) point where the critical line
intersects the boundary of $\mathcal{V}(\omega)$. On the other hand, when $\mathcal{V}_c(\omega)$ is nonconvex there are
multiple points where the critical line intersects the boundary of $\mathcal{V}(\omega)$. In such cases,
definition (5.3) indicates that $\rho_c(\omega)$ is a function of the distance between $g_0(j\omega)$ and
the boundary-intersection point that is closest to the critical point $-1 + j0$. Since in many
cases the convexity of $\mathcal{V}_c(\omega)$ at any given frequency may not be known a priori, the
generalized critical radius definition allows the application of the critical direction theory
without conservatism to a more general class of uncertain systems, including the case of
real affine uncertain systems discussed in ensuing sections. The Nyquist robust stability
margin $k_N(\omega)$ computed using the general definition (5.3) for $\rho_c(\omega)$ is attractive from
an analysis standpoint because through Theorem 5.2 it gives necessary and sufficient
conditions for robust stability. On the other hand, if $k_N(\omega)$ is computed using equation
(5.2) for $\rho_c(\omega)$, then the condition $k_N(\omega) < 1\ \forall \omega$ is only sufficient for robust stability
when the set $\mathcal{V}_c(\omega)$ is nonconvex. From a control design point of view, however, it may
be advantageous to adopt the computationally simpler definition (5.2) even for the case
where $\mathcal{V}_c(\omega)$ is nonconvex, and accept the result as a suboptimal design, as is done in the
context of the structured singular value paradigm where control design is based on an
upper bound rather than on the exact value of the structured singular value. It must be
remarked, however, that when $\mathcal{V}_c(\omega)$ is in fact convex, using definition (5.2) for $\rho_c(\omega)$
makes the resulting condition $k_N(\omega) < 1\ \forall \omega$ necessary and sufficient for robust
stability, and in such cases the results are not conservative. It must also be emphasized
that the uncertainty value set $\mathcal{V}(\omega)$ itself does not have to be convex for the critical
uncertainty value set $\mathcal{V}_c(\omega)$ to be convex.
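
The difference between the two definitions in the nonconvex case can be seen in a small numerical illustration. The sketch below is not taken from the source: the function names are hypothetical, and the data describe an invented critical value set made of two segments on the critical line, one before and one beyond the critical point, with $-1 + j0$ lying in the gap between them and hence outside the value set.

```python
def radius_convex_definition(segments):
    """Definition (5.2): largest ray parameter alpha whose point lies in V."""
    return max(upper for _, upper in segments)


def radius_general_definition(g0, d_c, boundary_alphas, critical_point_in_value_set):
    """Definition (5.3), with the boundary intersections given by ray parameter."""
    xi = min(abs(1 + g0 + alpha * d_c) for alpha in boundary_alphas)   # (5.4)
    base = abs(1 + g0)
    return base + xi if critical_point_in_value_set else base - xi


# Assumed data: g0 = 0, so the critical direction is -1 and the critical point
# sits at alpha = 1. The critical value set covers alpha in [0, 0.3] and
# [1.2, 1.5]; the point alpha = 1 is therefore not in the value set.
g0, d_c = 0j, -1 + 0j
segments = [(0.0, 0.3), (1.2, 1.5)]
boundary_alphas = [0.3, 1.2, 1.5]     # segment end points other than g0

k_from_52 = radius_convex_definition(segments) / abs(1 + g0)
k_from_53 = radius_general_definition(g0, d_c, boundary_alphas, False) / abs(1 + g0)
print(k_from_52, k_from_53)
```

Here the margin based on (5.2) is 1.5, so the test $k_N < 1$ fails even though the critical point is excluded from the value set, whereas the margin based on (5.3) is 0.8 and correctly indicates robust stability at this frequency.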

5.3.  Systems  with  Affine  Uncertainty  Structure 

In this section the generalized critical direction theory is specialized to systems
with real parametric uncertainties that appear in an affine fashion, namely, an uncertain
rational function of the form

$$g(s,q) = \frac{n_0(s) + \sum_{i=1}^{p} q_i\, n_i(s)}{d_0(s) + \sum_{i=1}^{p} q_i\, d_i(s)}, \qquad q \in Q \qquad (5.9a)$$

where

$$n_0(s) := \sum_{k=0}^{\ell} n_{0k}\, s^k \quad \text{and} \quad d_0(s) := \sum_{k=0}^{m} d_{0k}\, s^k$$

are known nominal polynomials,

$$n_i(s) = \sum_{k=0}^{\ell} n_{ik}\, s^k \quad \text{and} \quad d_i(s) = \sum_{k=0}^{m} d_{ik}\, s^k$$

are known perturbation polynomials, and $q = [q_1\ q_2\ \ldots\ q_p]^T \in \mathbb{R}^p$ is a vector of real
perturbation parameters belonging to the bounded rectangular polytope

$$Q = \{\, q \in \mathbb{R}^p \mid q_i^- \le q_i \le q_i^+,\ i = 1, 2, \ldots, p \,\} \qquad (5.9b)$$

where $q_i^-$ and $q_i^+$, $i = 1, 2, \ldots, p$, are finite real bounds. Equations (5.9a)-(5.9b) define a
class of finite-dimensional, linear, time-invariant, real systems with affine parametric
uncertainties. For completeness, the perturbation family $\Delta$ is implicitly understood to be
the set $\Delta := \{\, \delta(s,q) = g(s,q) - g(s,q_0) \mid q \in Q \,\}$ for this class of uncertainties.

The value set $\mathcal{V}(\omega)$ at a given frequency $\omega$ is defined as the set of the
Nyquist-plane points $g(j\omega,q)$ obtained for all $q \in Q$. Let $\partial\mathcal{V}(\omega)$ represent the boundary of the
uncertainty value set $\mathcal{V}(\omega)$ and $E(Q)$ represent the $2^{p-1}p$ edges of the bounding set $Q$.
Furthermore, let $g(j\omega, E(Q))$ represent the frame of the value set, namely, the image of
the edges of $Q$ on the Nyquist plane under the mapping $g(j\omega,q)$. Two important
properties of the value sets generated by systems with affine uncertainty are the following
(Fu, 1990): (i) at each fixed frequency the boundary $\partial\mathcal{V}(\omega)$ of the uncertainty value set
$\mathcal{V}(\omega)$ is spanned by the image of the edges of $Q$, i.e., $\partial\mathcal{V}(\omega)$ is spanned by the frame of
the value set; (ii) the image of each edge of $Q$ is either a circular arc or a line segment that
can be easily calculated analytically. The second property is a consequence of the affine
structure of the uncertainty, which induces a linear fractional mapping. In the following
sections we exploit these properties to develop a computational approach to find the
Nyquist robust stability margin for affine uncertain systems. The results allow the efficient
verification of the set membership (5.7) invoked in Theorem 5.1 via a linear feasibility
problem, and permit the calculation of the robust stability margin invoked in Theorem 5.2
via a systematic algorithm.
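
Property (ii) can be illustrated numerically: along an edge of $Q$ only one parameter varies, so $g(j\omega,q)$ is a linear fractional function of that parameter and its image lies on a circle or a line. The sketch below is only an illustration under stated assumptions. The helper names (`polyval`, `plant`, `circumcenter`) and the one-parameter system $g(s,q) = 1/(s + 1 + q_1)$ are hypothetical and are not taken from the source. Polynomials are stored as coefficient lists in ascending powers of $s$.

```python
def polyval(coeffs, s):
    """Evaluate sum_k coeffs[k] * s**k (ascending powers) by Horner's rule."""
    result = 0j
    for c in reversed(coeffs):
        result = result * s + c
    return result


def plant(s, q, n0, n_pert, d0, d_pert):
    """g(s, q) as in (5.9a); n_pert[i] and d_pert[i] hold the coefficients of n_i, d_i."""
    num = polyval(n0, s) + sum(qi * polyval(ni, s) for qi, ni in zip(q, n_pert))
    den = polyval(d0, s) + sum(qi * polyval(di, s) for qi, di in zip(q, d_pert))
    return num / den


def circumcenter(z1, z2, z3):
    """Center of the circle through three non-collinear complex points."""
    w, v = z2 - z1, z3 - z1
    return z1 + (abs(w) ** 2 * v - abs(v) ** 2 * w) / (w.conjugate() * v - v.conjugate() * w)


# Hypothetical system g(s, q) = 1 / (s + 1 + q1) with q1 in [-0.5, 0.5].
n0, n_pert = [1.0], [[0.0]]
d0, d_pert = [1.0, 1.0], [[1.0]]
s = 1j                                   # frequency omega = 1

edge_image = [plant(s, [q1], n0, n_pert, d0, d_pert) for q1 in (-0.5, -0.2, 0.1, 0.5)]
center = circumcenter(*edge_image[:3])   # circle through the first three points
radii = [abs(z - center) for z in edge_image]
print(center, radii)
```

For this system the four image points share the circle centered at $-0.5j$ with radius 0.5, so the image of the edge is an arc of that circle.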

5.4.  Robust  Stability  and  Uncertainty  Value-Set  Membership 

The first step in the computation of the generalized critical perturbation radius for
the uncertain system (5.9a)-(5.9b) is to determine whether the critical point $-1 + j0$
belongs to the uncertainty value set $\mathcal{V}(\omega)$. The more general problem of determining if
an arbitrary point $w \in \mathbb{C}$ belongs to $\mathcal{V}(\omega)$ is solved in this section, and the results are
then utilized to reformulate Theorem 5.1 in terms of computable quantities.


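The affine uncertain system (5.9) can be written in the vector-matrix form

$$g(s,q) =
\frac{\begin{bmatrix} 1 & s & \cdots & s^{\ell} \end{bmatrix}
\left( \begin{bmatrix} n_{00} \\ n_{01} \\ \vdots \\ n_{0\ell} \end{bmatrix}
+ \begin{bmatrix} n_{10} & n_{20} & \cdots & n_{p0} \\ n_{11} & n_{21} & \cdots & n_{p1} \\ \vdots & \vdots & & \vdots \\ n_{1\ell} & n_{2\ell} & \cdots & n_{p\ell} \end{bmatrix}
\begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_p \end{bmatrix} \right)}
{\begin{bmatrix} 1 & s & \cdots & s^{m-1} & s^{m} \end{bmatrix}
\left( \begin{bmatrix} d_{00} \\ d_{01} \\ \vdots \\ d_{0m} \end{bmatrix}
+ \begin{bmatrix} d_{10} & d_{20} & \cdots & d_{p0} \\ d_{11} & d_{21} & \cdots & d_{p1} \\ \vdots & \vdots & & \vdots \\ d_{1m} & d_{2m} & \cdots & d_{pm} \end{bmatrix}
\begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_p \end{bmatrix} \right)}
= \frac{s_n^T (n_0 + N_p q)}{s_d^T (d_0 + D_p q)} \qquad (5.10)$$

where $s_n$ and $s_d$ are vectors of lengths $\ell + 1$ and $m + 1$, containing powers of the Laplace
variable $s$, and where $n_0 \in \mathbb{R}^{\ell+1}$, $d_0 \in \mathbb{R}^{m+1}$, $N_p \in \mathbb{R}^{(\ell+1) \times p}$ and $D_p \in \mathbb{R}^{(m+1) \times p}$ are constant
vectors and matrices that represent the structure of the affine parametric uncertainty. The
value set at frequency $\omega$ is obtained by evaluating (5.10) at $s = j\omega$ for all $q \in Q$ to yield

$$g(j\omega,q) = \frac{s_{n,R}^T (n_{0,R} + N_{p,R}\, q) + j\, s_{n,I}^T (n_{0,I} + N_{p,I}\, q)}{s_{d,R}^T (d_{0,R} + D_{p,R}\, q) + j\, s_{d,I}^T (d_{0,I} + D_{p,I}\, q)}, \qquad \forall q \in Q \qquad (5.11)$$

Here the subscripts $R$ and $I$ identify the arrays that generate the real and imaginary parts,
respectively, of the numerator and denominator at $s = j\omega$; for example,
$D_{p,I} \in \mathbb{R}^{[(m+1)/2] \times p}$, where $[\cdot]$ represents the greatest-integer function.

Now consider an arbitrary Nyquist-plane point $w = w_R + j w_I \in \mathbb{C}$ with finite
magnitude. Clearly, $w \in \mathcal{V}(\omega)$ if and only if there exists a vector $q \in Q$ such that
$g(j\omega,q) = w$. Using (5.11) this condition is equivalent to finding a vector $q \in Q$ that
solves the equation

$$\frac{s_{n,R}^T (n_{0,R} + N_{p,R}\, q) + j\, s_{n,I}^T (n_{0,I} + N_{p,I}\, q)}{s_{d,R}^T (d_{0,R} + D_{p,R}\, q) + j\, s_{d,I}^T (d_{0,I} + D_{p,I}\, q)} = w_R + j w_I \qquad (5.12)$$

This problem can be characterized as a linear equality/inequality problem as shown
below.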

Theorem 5.3. Let $w \in \mathbb{C}$ be an arbitrary point with finite magnitude on the
Nyquist plane. Then $w \in \mathcal{V}(\omega)$ if and only if there exists a feasible solution
$q \in \mathbb{R}^p$ to the linear equality

$$A(w)\, q = b(w) \qquad (5.13a)$$

subject to the linear-inequality constraint

$$\begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
-1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
0 & -1 & 0 & \cdots & 0 \\
\vdots & & & & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & -1
\end{bmatrix} q \le
\begin{bmatrix} q_1^+ \\ -q_1^- \\ q_2^+ \\ -q_2^- \\ \vdots \\ q_p^+ \\ -q_p^- \end{bmatrix} \qquad (5.13b)$$

where

$$A(w) := \begin{bmatrix}
s_{n,R}^T N_{p,R} - w_R\, s_{d,R}^T D_{p,R} + w_I\, s_{d,I}^T D_{p,I} \\
s_{n,I}^T N_{p,I} - w_R\, s_{d,I}^T D_{p,I} - w_I\, s_{d,R}^T D_{p,R}
\end{bmatrix} \in \mathbb{R}^{2 \times p} \qquad (5.14)$$

$$b(w) := \begin{bmatrix}
-s_{n,R}^T n_{0,R} + w_R\, s_{d,R}^T d_{0,R} - w_I\, s_{d,I}^T d_{0,I} \\
-s_{n,I}^T n_{0,I} + w_R\, s_{d,I}^T d_{0,I} + w_I\, s_{d,R}^T d_{0,R}
\end{bmatrix} \in \mathbb{R}^{2} \qquad (5.15)$$

Proof. A finite point $w \in \mathbb{C}$ belongs to $\mathcal{V}(\omega)$ if and only if there exists a vector
$q \in Q$ that satisfies equation (5.12). Since the denominator on the left-hand side of (5.12)
is non-zero due to the finite magnitude of $w$, the equality can be rationalized by
multiplication by the denominator, and after isolating the real and imaginary parts leads to
the equivalent set of equations

$$\begin{bmatrix}
s_{n,R}^T N_{p,R} - w_R\, s_{d,R}^T D_{p,R} + w_I\, s_{d,I}^T D_{p,I} \\
s_{n,I}^T N_{p,I} - w_R\, s_{d,I}^T D_{p,I} - w_I\, s_{d,R}^T D_{p,R}
\end{bmatrix} q =
\begin{bmatrix}
-s_{n,R}^T n_{0,R} + w_R\, s_{d,R}^T d_{0,R} - w_I\, s_{d,I}^T d_{0,I} \\
-s_{n,I}^T n_{0,I} + w_R\, s_{d,I}^T d_{0,I} + w_I\, s_{d,R}^T d_{0,R}
\end{bmatrix}$$

which becomes equality (5.13a) after the matrix $A(w)$ and the vector $b(w)$ are defined as
given in (5.14) and (5.15), respectively. From (5.9b), the restriction that $q \in Q$ can be
described by the linear inequality (5.13b). Hence, it follows that $w \in \mathbb{C}$ is an element of
$\mathcal{V}(\omega)$ if and only if there exists a feasible solution to the linear equality/inequality
problem (5.13). Q.E.D.

Theorem  5.3  poses  the  uncertainty  value  set  membership  problem  as  a  standard 
linear  equality/inequality  feasibility  problem  whose  solution  can  be  found  in  classical 
linear  programming  references  (see  Luenberger,  1984,  for  example).  The  linear  map 
given  by  (5.13)-(5.15)  is  obtained  for  the  rational  function  (5.1)  with  the  affine 
uncertainty structure given by (5.9). A formally analogous linear map based on the
zero-exclusion principle has been developed by Bhattacharyya et al. (1995) for the case of
affine  uncertain  polynomials.  Equations  (5.13)-(5.15)  constitute  an  extension  of  that 
approach  to  the  case  of  rational  systems,  and  a  generalization  to  the  case  of  an  arbitrary 
point  on  the  complex  plane. 
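
The membership test of Theorem 5.3 can be sketched in a few lines. The code below is an illustration under stated assumptions, not the procedure used in the source. It builds the two rows of $A(w)q = b(w)$ directly from the complex quantities $n_i(j\omega) - w\,d_i(j\omega)$ rather than from the split arrays of (5.14)-(5.15), and instead of calling a linear-programming solver it uses a swapped-in geometric test: the image of the box $Q$ under a $2 \times p$ linear map is a planar zonotope, so feasibility can be decided by comparing support functions along a finite set of directions. All function names and the example system $g(s,q) = 1/(s + 1 + q_1)$ are hypothetical.

```python
def polyval(coeffs, s):
    """Evaluate sum_k coeffs[k] * s**k (ascending powers) by Horner's rule."""
    result = 0j
    for c in reversed(coeffs):
        result = result * s + c
    return result


def equality_data(w, omega, n0, n_pert, d0, d_pert):
    """Columns and right-hand side of A(w) q = b(w), stored as complex numbers.

    The real and imaginary parts of each complex entry are the two rows of (5.13a).
    """
    s = 1j * omega
    cols = [polyval(ni, s) - w * polyval(di, s) for ni, di in zip(n_pert, d_pert)]
    rhs = -(polyval(n0, s) - w * polyval(d0, s))
    return cols, rhs


def in_value_set(w, omega, n0, n_pert, d0, d_pert, q_lo, q_hi, tol=1e-9):
    """True if some q with q_lo <= q <= q_hi satisfies A(w) q = b(w)."""
    cols, rhs = equality_data(w, omega, n0, n_pert, d0, d_pert)

    def dot(u, z):
        return (u.conjugate() * z).real      # planar inner product

    # Candidate normals: the coordinate axes plus, for every nonzero column,
    # the column direction and its perpendicular (both signs are tested below).
    directions = [1 + 0j, 1j]
    for a in cols:
        if abs(a) > tol:
            directions += [a / abs(a), 1j * a / abs(a)]

    for u in directions:
        for sign in (1.0, -1.0):
            u_s = sign * u
            # Support function of the zonotope {sum_i q_i a_i : q in the box}.
            support = sum(max(lo * dot(u_s, a), hi * dot(u_s, a))
                          for a, lo, hi in zip(cols, q_lo, q_hi))
            if dot(u_s, rhs) > support + tol:
                return False                  # a separating direction exists
    return True


# Hypothetical system g(s, q) = 1 / (s + 1 + q1), q1 in [-0.5, 0.5], omega = 1.
n0, n_pert = [1.0], [[0.0]]
d0, d_pert = [1.0, 1.0], [[1.0]]
box = ([-0.5], [0.5])

nominal_inside = in_value_set(0.5 - 0.5j, 1.0, n0, n_pert, d0, d_pert, *box)
critical_inside = in_value_set(-1 + 0j, 1.0, n0, n_pert, d0, d_pert, *box)
print(nominal_inside, critical_inside)
```

In this example the nominal response $g(j1, 0) = 0.5 - 0.5j$ is reported as a member of the value set, while the critical point $-1 + j0$ is not. A general-purpose linear-programming routine would serve equally well for the feasibility problem; the support-function test is used here only to keep the sketch free of external dependencies.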

To calculate the critical perturbation radius using (5.3) it is necessary to determine
if $-1 + j0$ is an element of the uncertainty value set $\mathcal{V}(\omega)$. From Theorem 5.1, the
solution to this set-membership problem directly determines the robust stability of the
system. Therefore, the robust stability of systems with real affine parametric uncertainty
can be efficiently determined by specializing Theorem 5.3 to the case $w = -1 + j0$ and
solving the resulting feasibility problem. The result is given in the following theorem.

Theorem 5.4. Consider the real affine uncertain system given in (5.9a)-(5.9b)
with assumptions (A1) and (A2). Then the closed-loop system is robustly stable
under unity feedback if and only if the following linear equality/inequality problem
in $q \in \mathbb{R}^p$ is infeasible at all frequencies:

$$A(-1 + j0)\, q = b(-1 + j0) \qquad (5.16a)$$

subject to the linear-inequality constraint of (5.13b),

$$q_i^- \le q_i \le q_i^+, \qquad i = 1, 2, \ldots, p \qquad (5.16b)$$

where

$$A(-1 + j0) = \begin{bmatrix}
s_{n,R}^T N_{p,R} + s_{d,R}^T D_{p,R} \\
s_{n,I}^T N_{p,I} + s_{d,I}^T D_{p,I}
\end{bmatrix} \in \mathbb{R}^{2 \times p},
\qquad
b(-1 + j0) = \begin{bmatrix}
-s_{n,R}^T n_{0,R} - s_{d,R}^T d_{0,R} \\
-s_{n,I}^T n_{0,I} - s_{d,I}^T d_{0,I}
\end{bmatrix} \in \mathbb{R}^{2}$$

Proof. From Theorem 5.1, the condition $-1 + j0 \notin \mathcal{V}(\omega)$ for all frequencies $\omega$ is
necessary and sufficient for ensuring robust stability. The result follows by applying
Theorem 5.3 to the special case of the point $w = -1 + j0$. Q.E.D.

Theorem 5.4 represents a systematic method for determining the robust stability of
systems with real affine parametric uncertainties.

5.5.  Computation  of  the  Critical  Perturbation  Radius 
The computation of the Nyquist robust stability margin $k_N(\omega)$ for the affine
uncertain system (5.9) requires that the critical perturbation radius $\rho_c(\omega)$ be calculated
first. As indicated by (5.3) and (5.4), this in turn requires determining the set $\mathcal{B}_c(\omega)$.
Once all the elements of $\mathcal{B}_c(\omega)$ have been identified it is straightforward to calculate
$\rho_c(\omega)$ from the applicable formulas. For the case of affine-uncertain systems of the form
(5.9a), the critical boundary-intersections set $\mathcal{B}_c(\omega)$ can be effectively identified using a
two-step strategy.

The first step consists of finding the set of points $F = \{P_i,\ i = 1, 2, \ldots, k\}$ that
correspond to all the intersections of the critical line $r(\omega)$ with the frame
$g(j\omega, E(Q))$. This reduces to a simple problem in two-dimensional computational
geometry after recognizing that the critical line is a ray and that the frame is a collection
of circular arcs and straight-line segments. Further details are given in Section 5.6.
Note that all the points in the frame-intersection set $F$ are elements of $\mathcal{V}_c(\omega)$ because
each point belongs to $r(\omega)$. It follows that $\mathcal{B}_c(\omega) \subseteq F$: every element of $\mathcal{B}_c(\omega)$
must be an element of $F$, whereas some elements of $F$ may not lie on the value-set
boundary $\partial\mathcal{V}(\omega)$.

The second step consists of constructing the set of critical intersections $\mathcal{B}_c(\omega)$ by
selecting from $F$ all the points that also lie on the value-set boundary $\partial\mathcal{V}(\omega)$. For the special case where
$F = \{g_0(j\omega)\}$ it follows that $\mathcal{B}_c(\omega) = \{g_0(j\omega)\}$. For the more general case where
$g_0(j\omega)$ is not the only element of $F$, the set $\mathcal{B}_c(\omega)$ is constructed by selecting all the points
in $F$ that lie on the boundary $\partial\mathcal{V}(\omega)$. In accordance with the general definition of
$\mathcal{B}_c(\omega)$, the point $g_0(j\omega)$ must be excluded. Therefore, if $g_0(j\omega) \in F$, the set $F$ must be
redefined by excluding from it the element $g_0(j\omega)$ through the assignment
$F \setminus \{g_0(j\omega)\} \to F$. Without loss of generality, assume that the points in $F$ are ordered by
increasing distance from the nominal point $g_0(j\omega)$, so that $P_1$ and $P_k$ are respectively
the closest and farthest points from $g_0(j\omega)$. Clearly, $\mathcal{B}_c(\omega)$ contains the point $P_k$.
When $k > 1$ the additional elements of $\mathcal{B}_c(\omega)$ can be readily identified from $F$ by
considering all the straight-line segments $P_n P_{n+1}$, $n = 1, 2, \ldots, k-1$, $P_n \in F$. Clearly, if
one interior point of the segment $P_n P_{n+1}$ lies outside the value set $\mathcal{V}(\omega)$, it can be
concluded that both $P_n$ and $P_{n+1}$ lie on the boundary of the value set, and hence both
are elements of $\mathcal{B}_c(\omega)$. It is easily seen that this is a necessary and
sufficient condition for the membership of $P_n$ and $P_{n+1}$ in $\mathcal{B}_c(\omega)$. All the elements of
$\mathcal{B}_c(\omega)$ can be systematically identified through such a sequential analysis of the points in
$F$. Note that to determine whether the end points of a given segment $P_n P_{n+1}$ belong to $\mathcal{B}_c(\omega)$ it
suffices to test whether any interior point, say the midpoint, of the segment lies outside
$\mathcal{V}_c(\omega)$. This set-membership condition can be readily determined by applying
Theorem 5.3 to the segment midpoint.
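
The sequential midpoint test described above can be sketched as follows. This is a minimal illustration, not the implementation used to produce the results in this chapter: the function name `critical_boundary_set`, the tolerance, and the membership oracle `in_value_set` (which would stand in for an application of Theorem 5.3) are all hypothetical.

```python
def critical_boundary_set(g0, frame_hits, in_value_set, tol=1e-9):
    """Select from the frame-intersection set F the points retained by the
    sequential midpoint test.

    g0           : nominal point (complex)
    frame_hits   : iterable of complex points of F, all on the critical ray
    in_value_set : callable z -> bool, membership oracle for the value set
    """
    # Exclude the nominal point and order the rest by distance from it.
    F = sorted((z for z in frame_hits if abs(z - g0) > tol),
               key=lambda z: abs(z - g0))
    if not F:
        return [g0]                      # special case F = {g0}
    selected = {len(F) - 1}              # the farthest point P_k always belongs
    for n in range(len(F) - 1):
        mid = (F[n] + F[n + 1]) / 2.0    # any interior point of the segment works
        if not in_value_set(mid):
            selected.update((n, n + 1))  # both end points lie on the boundary
    return [F[i] for i in sorted(selected)]
```

The cost is one membership test per consecutive pair of points in $F$, so the number of feasibility problems solved at each frequency grows only linearly with the number of frame intersections.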

5.6.  Intersection  of  a  Ray  and  Arcs  in  the  Complex  Plane 

As discussed above, the first step in the computation of the critical perturbation
radius consists of finding the set of points $F = \{P_i(\omega),\ i = 1, 2, \ldots, k\}$ that correspond to
intersections of the critical line $r(\omega)$ with the frame $g(j\omega, E(Q))$. This is equivalent to
determining the intersection of a ray (the critical line) and arcs or straight-line segments
(the frame) in the complex plane. The equations needed for characterizing the
intersection of the critical line with straight-line segments are trivial and are therefore
omitted for brevity. The determination of all the intersections of the critical line with a
finite number $k$ of circular arcs is somewhat more subtle and is therefore discussed in
greater detail.

The basic geometric objects of interest are defined as follows. A line passing
through two points $p_0, p_1 \in \mathbb{C}$ can be represented by

\[ L(p_0, p_1) := \{ z \in \mathbb{C} \mid z = p_0 + t(p_1 - p_0),\ t \in \mathbb{R} \} \tag{5.17} \]

The same relation can be used to represent a ray $r(p_0, p_1)$ with origin at $p_0$ and pointing
towards $p_1$ by restricting the parameter $t$ to non-negative values. The line
$L(p_0, p_1)$ is said to be the supporting line for $r(p_0, p_1)$. The critical direction $r(\omega)$ is
therefore represented by the ray $r(p_0, p_1)$ with $p_0 = g_0(j\omega)$ and $p_1 = -1 + j0$.

Consider a set of $k$ circles with centers at points $z_i \in \mathbb{C}$ and radii $r_i \in \mathbb{R}$,
$i = 1, 2, \ldots, k$, where each circle satisfies the relation

\[ C_i := \{ z \in \mathbb{C} \mid (z - z_i)\overline{(z - z_i)} = r_i^2,\ r_i > 0 \} \tag{5.18} \]

Therefore, two parameters (the center and the radius) are sufficient to define each circle. On the other hand, three
parameters are required to describe an oriented circular arc that passes through three
points $a_0, a_1, a_2 \in \mathbb{C}$, in that order; such an arc will be denoted
$a_i(a_0, a_1, a_2)$ if it belongs to the supporting circle $C_i$.

The relative position of a set of points and the orientation of arcs and rays in the
plane can easily be determined by invoking the cross product operation. Let $p_0, p_1 \in \mathbb{C}$; then
the cross product of $p_0$ and $p_1$ can be defined as

\[ p_0 \times p_1 := \operatorname{Im}\{ p_0 \bar{p}_1 \} \]

The sign of the cross product determines the relative orientation of $p_0$ and $p_1$ with
respect to the origin: $p_1$ lies to the left (right) of $p_0$ if $p_0 \times p_1 < 0$ ($> 0$), and $p_1$ and $p_0$
are collinear if $p_0 \times p_1 = 0$ (Cormen et al., 1990). The cross product can also be used to
determine the direction in which an oriented circular arc "turns". Let $a(a_0, a_1, a_2)$ be a
circular arc that originates at $a_0$, passes through $a_1$, and ends at $a_2$. Then, the arc turns
left (right) if $(a_2 - a_0) \times (a_1 - a_0) > 0$ ($< 0$). If $(a_2 - a_0) \times (a_1 - a_0) = 0$ the three points are
collinear, and the arc degenerates into a line segment. The arc $a(a_0, a_1, a_2)$ is said to be
positive (negative) if it turns left (right).
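
The orientation tests above translate directly into a few lines of code. The sketch below assumes that the cross product is $\operatorname{Im}\{p_0 \bar{p}_1\}$, which is the choice consistent with the sign conventions stated in the text; the function names `cross` and `arc_orientation` are illustrative.

```python
def cross(p0: complex, p1: complex) -> float:
    """Cross product with the sign convention of the text: Im{p0 * conj(p1)}."""
    return (p0 * p1.conjugate()).imag


def arc_orientation(a0: complex, a1: complex, a2: complex) -> int:
    """Return +1 if the arc a0 -> a1 -> a2 turns left (positive arc),
    -1 if it turns right (negative arc), and 0 if the points are collinear."""
    c = cross(a2 - a0, a1 - a0)
    return (c > 0) - (c < 0)
```

For instance, with this convention `cross(1, 1j)` is negative, in agreement with the statement that $p_1 = j$ lies to the left of $p_0 = 1$.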

The following four-step strategy to compute the intersection of a ray and a finite
number of arcs is proposed: (i) find $I_L^{(i)} = L(g_0(j\omega), -1 + j0) \cap C_i$, the set of the
intersections (if any) of the supporting line (i.e., the line that contains the critical ray) and
the $i$-th supporting circle (i.e., the circle that contains the $i$-th arc); (ii) find
$I_r^{(i)} = r(g_0(j\omega), -1 + j0) \cap I_L^{(i)}$, the set of intersections of the critical ray and the supporting
circle; (iii) find $I_a^{(i)} = a_i \cap I_r^{(i)} \subseteq I_r^{(i)}$, the intersections of the critical ray and the $i$-th arc;
and (iv) find $F = \bigcup_i I_a^{(i)}$, the union over all arcs of all the possible ray-arc intersections.

The set $F$ corresponds to the desired set of intersections of the critical line $r(\omega)$ with the
elements of the frame $g(j\omega, E(Q))$ that are described by circular arcs. The
remaining elements of the set $F$ are of course the points that represent the
intersections of $r(\omega)$ with the elements of the frame $g(j\omega, E(Q))$ that are described by
straight-line segments.
Each of the four steps outlined above can be quantified in more precise terms. To
execute step (i), consider a ray $r(p_0, p_1)$ and an oriented arc $a_i(a_0, a_1, a_2)$ whose
respective supporting line $L(p_0, p_1)$ and supporting circle $C_i$ are given by (5.17) and
(5.18). The set of intersections $I_L^{(i)}$ of the supporting line and supporting circle is
determined by the real solutions of the quadratic algebraic equation $at^2 + bt + c = 0$,
where $a = |p_1 - p_0|^2$, $b = 2\operatorname{Re}\{(p_1 - p_0)\overline{(p_0 - z_i)}\}$, and $c = |p_0|^2 + |z_i|^2 - (r_i^2 + 2\operatorname{Re}\{p_0 \bar{z}_i\})$.
It readily follows that there are no intersections if the discriminant $d := b^2 - 4ac$ is
negative. Furthermore, if the discriminant is zero the supporting line is tangent to the
circle and there is only one intersection. Finally, there are two intersections if the
discriminant is positive. Now let $\{t_1, t_2\}$ denote the one or two real solutions to the
quadratic equation for the case where the discriminant is non-negative. The set of
intersection points $I_L^{(i)}$ is composed of the points $z$ in equation (5.17) obtained by setting
$t = t_k$, $k = 1, 2$. To execute step (ii) and find $I_r^{(i)} \subseteq I_L^{(i)}$ it is sufficient to discard the
points of $I_L^{(i)}$ that correspond to negative values of $t_k$. To complete step (iii), the set
$I_a^{(i)} \subseteq I_r^{(i)}$ can be determined using the cross product properties. Consider an arc
$a_i(a_0, a_1, a_2)$ and an intersection point $y \in I_r^{(i)}$. Then $y$ belongs to $I_a^{(i)}$ if and only if the
arcs $a_i(a_0, a_1, a_2)$ and $a_i(a_0, y, a_2)$ are both positive or both negative (i.e., both arcs turn
in the same direction). Finally, step (iv) is completed by setting $F$ as the union
of the sets $I_a^{(i)}$ obtained when considering all the arcs $a_i$, $i = 1, 2, \ldots, k$.
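
The four steps can be put together in a short routine. The sketch below is only an illustration under stated assumptions: each arc is supplied with the center and radius of its supporting circle together with its three defining points, the cross product is taken as $\operatorname{Im}\{p_0 \bar{p}_1\}$, and an intersection that coincides exactly with an arc end point is not reported. The function names are hypothetical.

```python
import math


def cross(p0, p1):
    """Cross product Im{p0 * conj(p1)} (assumed sign convention)."""
    return (p0 * p1.conjugate()).imag


def orientation(a0, a1, a2):
    """+1 for a left-turning arc, -1 for a right-turning arc, 0 if collinear."""
    c = cross(a2 - a0, a1 - a0)
    return (c > 0) - (c < 0)


def ray_arc_intersections(p0, p1, zc, r, a0, a1, a2):
    """Intersections of the ray from p0 towards p1 with the arc a0 -> a1 -> a2
    lying on the circle with centre zc and radius r."""
    d = p1 - p0
    a = abs(d) ** 2
    b = 2.0 * (d * (p0 - zc).conjugate()).real
    c = abs(p0) ** 2 + abs(zc) ** 2 - (r ** 2 + 2.0 * (p0 * zc.conjugate()).real)
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return []                                  # step (i): no line-circle intersection
    root = math.sqrt(disc)
    ts = sorted({(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)})
    hits = []
    for t in ts:
        if t < 0:
            continue                               # step (ii): behind the ray origin
        y = p0 + t * d
        if orientation(a0, y, a2) == orientation(a0, a1, a2):
            hits.append(y)                         # step (iii): y lies on the arc
    return hits


def frame_arc_intersections(p0, p1, arcs):
    """Step (iv): union over all arcs, each given as (zc, r, a0, a1, a2)."""
    F = []
    for arc in arcs:
        F.extend(ray_arc_intersections(p0, p1, *arc))
    return F
```

Intersections with the straight-line segments of the frame would have to be appended separately to obtain the complete set $F$.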



5.7.  Examples 
Consider the system with affine uncertainty structure

\[ g(s, q) = \frac{(0.3s^3 + 2.2s^2 + 10s + 20) + (0.12s^2 + 0.7s + 1)q_1 + (0.06s^2 + 0.2s)q_2 + (-0.3s - 1)q_3}{(s^4 + 9.5s^3 + 27s^2 + 22.5s + 0.1) + (0.5s^3 + 2s^2 - s)q_1 + (-0.5s^3 + s^2)q_2 + (0.5s^3 + s)q_3} \tag{5.19} \]

where the parameter vector $q = [q_1\ q_2\ q_3]$ is an element of a rectangular polytope
$Q \subset \mathbb{R}^3$, and where the nominal system $g_0(s) = g(s, q_0)$ is obtained with $q_0 = [0\ 0\ 0]$.
System (5.19) is a modified version of the model investigated by Fu (1990). The
objective is to analyze the robustness of the unity-feedback structure shown in Figure 5.1.
Two examples are given, namely, a case where the critical uncertainty value set is convex
and one where it is nonconvex.
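
For readers who wish to reproduce the nominal points quoted in the examples, the following sketch evaluates the transfer function (5.19) at $s = j\omega$. The coefficient lists are transcribed from (5.19); the names `polyval`, `NUM`, `DEN`, and `g` are illustrative.

```python
def polyval(coeffs, s):
    """Evaluate a polynomial with coefficients in descending powers (Horner)."""
    acc = 0j
    for c in coeffs:
        acc = acc * s + c
    return acc


# Coefficients of (5.19) in descending powers of s.
# Entry 0 is the nominal polynomial; entries 1..3 multiply q1, q2, q3.
NUM = [[0.3, 2.2, 10.0, 20.0], [0.12, 0.7, 1.0], [0.06, 0.2, 0.0], [-0.3, -1.0]]
DEN = [[1.0, 9.5, 27.0, 22.5, 0.1], [0.5, 2.0, -1.0, 0.0],
       [-0.5, 1.0, 0.0, 0.0], [0.5, 0.0, 1.0, 0.0]]


def g(omega, q=(0.0, 0.0, 0.0)):
    """Frequency response g(j*omega, q) of the affine uncertain system (5.19)."""
    s = 1j * omega
    n = polyval(NUM[0], s) + sum(qi * polyval(ni, s) for qi, ni in zip(q, NUM[1:]))
    d = polyval(DEN[0], s) + sum(qi * polyval(di, s) for qi, di in zip(q, DEN[1:]))
    return n / d
```

Calling `g(0.7)` and `g(0.95)` with the default $q = 0$ gives the nominal points used in Examples 5.1 and 5.2, respectively.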
5.7.1.  Example  5.1  -  Convex  Critical  Value  Set 

Consider the system (5.19) where the parametric uncertainty vector belongs to the
square polytope

\[ Q = \{ q \in \mathbb{R}^3 \mid -3 \le q_i \le 3,\ i = 1, 2, 3 \} \tag{5.20} \]

Figure 5.3 shows the frame $g(j\omega, E(Q))$ of the uncertainty value set at the frequency
$\omega = 0.7$. At this frequency it is readily concluded by inspection that the critical value set
$\mathcal{V}_c(\omega)$ is convex, since it consists of a single line segment. Note that, in contrast, the
entire value set $\mathcal{V}(\omega)$ is nonconvex.

In order to apply Theorem 5.4, consider the frequency $\omega = 0.7$ and define the
following elements:

\[ s_{n,R}^T = [1.0000\ \ -0.4900], \qquad s_{n,I}^T = [0.7000\ \ -0.3430] \tag{5.21a} \]

\[ s_{d,R}^T = [1.0000\ \ -0.4900\ \ 0.2401], \qquad s_{d,I}^T = [0.7000\ \ -0.3430] \tag{5.21b} \]
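
The numerical values in (5.21a)-(5.21b) are consistent with splitting the powers of $j\omega$ into their real and imaginary parts: even powers are real, odd powers are purely imaginary. The sketch below reproduces them under that reading, which is inferred from the numbers rather than from a definition given in this section; the function name `power_vectors` is illustrative.

```python
def power_vectors(omega, degree):
    """Split the powers (j*omega)^k, k = 0..degree, into real and imaginary parts.

    The real vector collects the even powers k = 0, 2, 4, ... and the
    imaginary vector collects the odd powers k = 1, 3, 5, ...
    """
    s = 1j * omega
    s_real = [(s ** k).real for k in range(0, degree + 1, 2)]
    s_imag = [(s ** k).imag for k in range(1, degree + 1, 2)]
    return s_real, s_imag
```

With `omega = 0.7`, a degree of 3 (the numerator of (5.19)) gives the pair in (5.21a), and a degree of 4 (the denominator) gives the pair in (5.21b), up to rounding.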

[Equations (5.21c)-(5.21f), (5.22), and (5.23), which give the remaining vector and matrix definitions, are missing from the source.]

For this problem, the constraint (5.16b) is

[Equation (5.24) is missing from the source.]

It can be readily verified using an active-set method (Luenberger, 1984) that the linear
equality/inequality problem (5.16a)-(5.16b) with the data shown above is infeasible.
Invoking Theorem 5.4 it then follows that $-1 + j0 \notin \mathcal{V}(\omega)$, and hence it can be claimed
that at $\omega = 0.7$ the value set $\mathcal{V}(\omega)$ excludes the critical point.
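
The infeasibility claim can be sanity-checked without a solver. The sketch below is not the active-set method cited above. It assumes that, provided the denominator does not vanish, the condition $g(j\omega, q) = -1$ is equivalent to $n(j\omega, q) + d(j\omega, q) = 0$, whose real and imaginary parts give two linear equations in $q$. It then applies a row-wise interval bound, which is only a sufficient certificate of infeasibility. All function names are illustrative.

```python
def polyval(coeffs, s):
    """Evaluate a polynomial with coefficients in descending powers (Horner)."""
    acc = 0j
    for c in coeffs:
        acc = acc * s + c
    return acc


# Coefficients of (5.19); entry 0 is nominal, entries 1..3 multiply q1, q2, q3.
NUM = [[0.3, 2.2, 10.0, 20.0], [0.12, 0.7, 1.0], [0.06, 0.2, 0.0], [-0.3, -1.0]]
DEN = [[1.0, 9.5, 27.0, 22.5, 0.1], [0.5, 2.0, -1.0, 0.0],
       [-0.5, 1.0, 0.0, 0.0], [0.5, 0.0, 1.0, 0.0]]


def critical_point_system(omega):
    """Return (A, b) such that n + d = 0 at s = j*omega reads A q = b."""
    s = 1j * omega
    total = [polyval(n, s) + polyval(d, s) for n, d in zip(NUM, DEN)]
    A = [[t.real for t in total[1:]], [t.imag for t in total[1:]]]
    b = [-total[0].real, -total[0].imag]
    return A, b


def row_bound_infeasible(A, b, lo, hi):
    """Sufficient (not necessary) test: True if some row of A q cannot
    reach the corresponding entry of b for any q in the box [lo, hi]."""
    for row, bi in zip(A, b):
        low = sum(min(a * l, a * h) for a, l, h in zip(row, lo, hi))
        high = sum(max(a * l, a * h) for a, l, h in zip(row, lo, hi))
        if bi < low or bi > high:
            return True
    return False
```

For the box (5.20) at $\omega = 0.7$ the certificate fires: the imaginary-part row can reach at most about 3.03 in magnitude, whereas the corresponding right-hand side is about $-19.39$. When the certificate does not fire, nothing can be concluded and a complete feasibility test is required.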

In order to quantify the degree of robust stability at this frequency, it is useful to
calculate the value of the Nyquist robust stability margin. It is possible to calculate $\rho_c(\omega)$
following the procedure discussed in Section 5.5. The first step consists of finding the set
of points $F = \{P_i,\ i = 1, 2, \ldots, k\}$ that define all the intersections of the critical line $r(\omega)$
with the frame $g(j\omega, E(Q))$. It follows that $F = \{-0.5185 - j0.9523,\ -0.5494 - j0.8913,\ -0.5660 - j0.8584\}$.




Figure 5.3. Frame of the uncertainty value set for the system of
Example 5.1 at the frequency $\omega = 0.7$. The critical point $-1 + j0$ and the
nominal point $g_0(j\omega)$ are represented by the "x" markers, the three
intersections of the arcs with the critical line are represented by the "+"
markers, and the intersection that defines the boundary point used in the
calculation of $\rho_c(\omega)$ is represented by the "*" marker. The critical value
set $\mathcal{V}_c(\omega)$ is convex at this frequency, and it is represented by a single
straight-line segment.

The elements of $\mathcal{B}_c(\omega)$ can be found using the systematic procedure discussed in
Section 5.5; however, it is also possible to identify the set from Figure 5.3, where it is
clear that only one element of $F$, namely $-0.5660 - j0.8584$, lies on the boundary;
therefore $\mathcal{B}_c(\omega) = \{-0.5660 - j0.8584\}$. As expected, $\mathcal{B}_c(\omega)$ contains only one element
because $\mathcal{V}_c(\omega)$ is convex. Next, using definition (5.3) one finds that at $\omega = 0.7$ the
critical perturbation radius is $\rho_c(\omega) = 0.1694$.


Figure 5.4. Critical perturbation radius $\rho_c(\omega)$ for the system considered
in Example 5.1.

Furthermore, since at this frequency $g_0(j\omega) = -0.4896 - j1.0096$, it follows that
$k_N(\omega) = 0.1694 / |1 - 0.4896 - j1.0096| = 0.1498 < 1$, which is consistent with the claim
that at $\omega = 0.7$ the value set excludes the critical point.

Figures 5.4 and 5.5 respectively show the values of $\rho_c(\omega)$ and $k_N(\omega)$ calculated
for a grid of 100 nonnegative frequency points that are equally spaced on a logarithmic
scale in the range [0.001, 10]. From Figure 5.5 it is readily concluded that $k_N(\omega) < 1$ for
all the frequencies investigated, and it can then be concluded from Theorem 5.2 that the
closed-loop system is robustly stable.

Figure 5.5. Nyquist robust stability margin $k_N(\omega)$ for the system
considered in Example 5.1.

5.7.2.  Example  5.2  -  Nonconvex  Critical  Value  Set. 

Now reconsider the system (5.19) for the case where the parametric uncertainty
vector belongs to the rectangular polytope

\[ Q = \{ q \in \mathbb{R}^3 \mid -10 \le q_1 \le 10,\ -0.3 \le q_2 \le 0.3,\ -0.3 \le q_3 \le 0.3 \} \tag{5.25} \]


With this modification to the example discussed in Section 5.7.1 the system now has a
nonconvex critical uncertainty value set $\mathcal{V}_c(\omega)$ at the frequency $\omega = 0.95$, as can be
determined by inspection of the frame $g(j\omega, E(Q))$ of the value set shown in Figure 5.6.
The figure clearly shows that as one follows the critical-direction ray from the nominal
point $g_0(j\omega)$ to the critical point, the ray realizes three intersections with the value set
boundary. The ray leaves the value set immediately after the first intersection with the
boundary occurs, and it does not reenter the value set until it realizes the second
intersection with the boundary. As a consequence of this geometry, the critical
uncertainty value set $\mathcal{V}_c(\omega)$ is the union of two disjoint line segments, and therefore
$\mathcal{V}_c(\omega)$ is nonconvex.

When applying Theorem 5.4 to this system for a particular frequency, the only
modification to the feasibility problem presented for the previous example concerns the
form of the inequality constraints (5.16b), which now adopt the form

[Equation (5.26) is missing from the source.]

instead of the form (5.24). The vector and matrix definitions (5.21c)-(5.21f) remain
unchanged. As in the previous example, it can be readily verified using an active-set
method that the linear equality/inequality problem (5.16a)-(5.16b) with the data
corresponding to this example is infeasible. Invoking Theorem 5.4, it then follows that
$-1 + j0 \notin \mathcal{V}(\omega)$, and hence it can be claimed that at $\omega = 0.95$ the value set excludes
the critical point.

The Nyquist robust stability margin $k_N(\omega)$ and the critical perturbation radius
$\rho_c(\omega)$ can now be calculated at any frequency using the method described in Section 5.5.
At the frequency $\omega = 0.95$, the set of intersections of the critical line $r(\omega)$ with the frame
$g(j\omega, E(Q))$ is

\[ F = \{-0.6510 - j0.3738,\ -0.4196 - j0.6217,\ -0.6349 - j0.3911,\ -0.4403 - j0.5995,\ -0.6512 - j0.3736,\ -0.6498 - j0.3751\} \]

The elements of $\mathcal{B}_c(\omega)$ can be identified from $F$ using the sequential method described in
Section 5.5 to yield

\[ \mathcal{B}_c(\omega) = \{-0.6510 - j0.3738,\ -0.6349 - j0.3911,\ -0.6512 - j0.3736\} \]

which contains three elements due to the nonconvexity of $\mathcal{V}_c(\omega)$. Using equation (5.4)
yields

\[ \min_{z \in \mathcal{B}_c(\omega)} |1 + z| = |1 - 0.6512 - j0.3736| = 0.5111 \]

and then invoking definition (5.3) it is readily determined that

\[ \rho_c(\omega) = |1 - 0.4140 - j0.6277| - 0.5111 = 0.3475 \]

at the frequency $\omega = 0.95$.


Figure 5.6. Frame of the uncertainty value set for the system of
Example 5.2 at the frequency $\omega = 0.95$. The critical point $-1 + j0$ and
the nominal point $g_0(j\omega)$ are represented by the "x" markers, the
intersections of the arcs with the critical line are represented by the "+"
markers, and the intersection that defines the boundary point used in the
calculation of $\rho_c(\omega)$ is represented by the "*" marker. The critical value
set $\mathcal{V}_c(\omega)$ is nonconvex at this frequency, and it is represented by the
union of two disjoint straight-line segments.

Furthermore, since $g_0(j\omega) = -0.4140 - j0.6277$ at this frequency, it follows that
$k_N(\omega) = 0.3475 / |1 - 0.4140 - j0.6277| = 0.4047 < 1$, a result that is consistent with the
conclusion reached by invoking Theorem 5.4 that the value set does not include the
critical point at this frequency.
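
The arithmetic of the two worked examples can be collected in a small helper. The formulas below are inferred from the numerical steps shown in the examples (the minimum of $|1 + z|$ over $\mathcal{B}_c(\omega)$, its subtraction from $|1 + g_0(j\omega)|$, and the final normalization); they are not a restatement of the general definitions (5.3) and (5.4), which are given elsewhere. The function name `nyquist_margin` is illustrative.

```python
def nyquist_margin(g0, Bc):
    """Return (rho_c, k_N) at one frequency.

    g0 : nominal point g0(j*omega), complex
    Bc : non-empty iterable of complex critical boundary intersections
    """
    dist_nominal = abs(1 + g0)              # distance from g0 to -1 + j0
    nearest = min(abs(1 + z) for z in Bc)   # boundary point closest to -1 + j0
    rho_c = dist_nominal - nearest
    return rho_c, rho_c / dist_nominal
```

With the rounded data of this example, the helper returns approximately 0.3476 and 0.4048, which agree with the reported values 0.3475 and 0.4047 to within rounding of the inputs.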

Figure 5.7. Critical perturbation radius $\rho_c(\omega)$ for the system considered
in Example 5.2. An observed discontinuity is shown with a dashed line;
a second discontinuity is not visible in the plot due to the tight scale of the
ordinate.

Figures 5.7 and 5.8 respectively show the values of $\rho_c(\omega)$ and $k_N(\omega)$ calculated
for a sequence of 250 frequency points equally spaced on a logarithmic scale in the range
[0.001, 10]. From Figure 5.8 it is readily concluded that $k_N(\omega) < 1$. Therefore, the
feedback loop is robustly stable. Also notice from Figures 5.7 and 5.8 that $\rho_c(\omega)$ and
$k_N(\omega)$ are discontinuous at frequencies approximately equal to $\omega = 0.020$ and
$\omega = 0.9417$. The presence of such discontinuities is not surprising since it is well known
that other stability margins (such as the real-$\mu$ robust stability margin) also exhibit
discontinuities. The observed discontinuities can be explained by examining Figure 5.6,
which shows the frame of the value set at the frequency $\omega = 0.95$. At this frequency the
critical line intersects the boundary of the uncertainty value set at three points (one point
near the nominal point and two points closer to $-1 + j0$).


Figure 5.8. Nyquist robust stability margin $k_N(\omega)$ for the system
considered in Example 5.2. Observed discontinuities are shown with a
dashed line.

The critical uncertainty value set $\mathcal{V}_c(\omega)$ is composed of two disjoint line segments: one
segment joining the nominal point with its nearest boundary point, and a second segment
joining the other two boundary points. The numerical value of $\rho_c(\omega)$ is in this case
calculated using the boundary point closest to $-1 + j0$. As the frequency decreases the
uncertainty value set rotates in a counterclockwise direction. This causes a progressive
reduction in the length of the line segment formed by the two boundary points that are
closest to $-1 + j0$. Eventually this line segment collapses into a single point in such a
way that the critical line passing through the point is locally tangent to the uncertainty
value set. When this situation arises, the critical value set $\mathcal{V}_c(\omega)$ is composed of the
tangent point just mentioned and one line segment (namely, that joining the nominal point
with its nearest boundary point). This effect is realized at a frequency slightly higher than
$\omega = 0.9417$ for the example under consideration, and the value of $\rho_c(\omega)$ must be
calculated using the tangent point. As the frequency is reduced slightly below the value
$\omega = 0.9417$ the critical line is no longer tangent to the value set, and its intersection with
the value set defines a single continuous segment; hence, $\mathcal{V}_c(\omega)$ recovers its convexity.
The calculation of $\rho_c(\omega)$ is now made using the only existing boundary point, which is
located closer to the nominal point. Hence, this local decrease in frequency in the
neighborhood of $\omega = 0.9417$ requires that the boundary point selected for the calculation
of $\rho_c(\omega)$ be changed from the tangent point (located closer to $-1 + j0$) to a
non-neighboring point that is located closer to the nominal point; this explains the
discontinuity in $\rho_c(\omega)$, which in turn causes $k_N(\omega)$ to be discontinuous.

5.8.  Conclusions 

The main contribution of this chapter is the generalization of the critical-direction
theory to the robust stability analysis of systems whose critical uncertainty value
sets are nonconvex. The generalization is obtained by introducing a new definition of the
critical  perturbation  radius,  and  the  effectiveness  of  the  generalized  theory  is  validated  by 
its  success  in  assigning  a  quantitative  robust  stability  measure,  namely,  a  computable 
Nyquist  robust  stability  margin,  to  problems  that  involve  affine  parametric  uncertainties 
characterized  by  real  vectors  that  belong  to  a  rectangular  polytope.  For  the  case 
considered,  the  calculation  of  the  Nyquist  robust  stability  margin  involves  planar 
geometry  operations  and  solving  a  series  of  linear  equality/inequality  feasibility  problems 
that  do  not  pose  major  computational  challenges. 


CHAPTER  6 

ROBUST CONTROLLER SYNTHESIS FOR SYSTEMS WITH NONCONVEX VALUE SETS USING AN EXTENSION OF THE NYQUIST ROBUST STABILITY MARGIN

6.1.  Introduction 

The  stability  analysis  of  feedback  control  systems  in  the  presence  of  modeling 
uncertainty  is  the  subject  of  extensive  studies;  in  particular,  the  class  of  structured 
uncertainties  where  the  parameters  of  a  transfer  function  vary  in  prescribed  real  intervals 
is  relevant  to  many  engineering  applications.  Early  advances  in  this  field  are  due  to  the 
well-known  theorem  by  Kharitonov  (1979)  that  gives  conditions  for  the  stability  of 
polynomial  systems  with  coefficients  that  belong  to  a  rectangular  polytope.  Extensions  of 
Kharitonov's  work  to  rational  functions  make  use  of  standard  frequency-domain 
techniques  such  as  Nyquist  plots  and  the  small  gain  theorem.  These  methods  are  based 
on  determining  the  stability  of  a  set  of  Kharitonov  plants  (or  extreme  plants)  derived  from 
an  interval  plant  description  where  each  coefficient  of  the  numerator  and  denominator 
polynomial  varies  in  a  fixed  interval.  The  number  of  extreme  plants  required  varies  with 
the  technique  utilized.  Chapellat  et  al.  (1989)  suggest  a  method  which  involves  checking 
the stability and calculating $H_\infty$ norms along a finite number (at most 32) of extreme
segments  called  Kharitonov  segments.  Barmish  et  al.  (1992)  prove  that  it  is  necessary 
and  sufficient  that  sixteen  of  the  extreme  plants  be  stable,  and  under  certain  conditions 
only  eight  or  twelve  are  necessary.  Bartlett  et  al.  (1990)  give  conditions  that  use  32  one- 
dimensional  subsets  of  the  interval  plant. 

Available results for interval plant descriptions enable the development of
computationally  manageable  methods  for  solving  a  wide  range  of  robust  stability  analysis 
problems.  However,  it  is  usually  the  case  that  each  plant  coefficient  depends  on  more 
than  one  uncertain  parameter.  One  such  uncertainty  description  occurs  when  the  plant 
coefficients  are  affine  in  the  uncertain  parameters.  Unfortunately,  the  extension  of 
Kharitonov's  approach  to  the  case  of  affine  uncertainties  is  not  straightforward. 
Nevertheless,  Fu  (1990)  presents  comprehensive  results  that  are  useful  for  quantifying  an 
entire  uncertainty  value  set  in  the  Nyquist  plane  for  plants  with  affine  parametric 
uncertainties.  A  remaining  challenge  is  to  utilize  these  results  to  produce  a  scalar 
measure  of  robustness  analogous  to  the  well  known  structured  singular  value  (Doyle, 
1982)  and  the  multivariable  stability  margin  (Safonov,  1982)  paradigms. 

The previous chapter proposes an alternative method that is applicable to both
interval and affine perturbations and is based on using the Nyquist robust stability margin
$k_N(\omega)$ as a measure of robust stability. The technique extends the critical-direction
theory developed by Latchman and Crisalle (1995) and Latchman et al. (1997) by
considering nonconvex critical uncertainty value sets. A general definition of the critical
perturbation radius $\rho_c(\omega)$ used in the calculation of $k_N(\omega)$ is proposed to take into
account nonconvex critical uncertainty value sets. The new general theory is applied to
the  case  of  systems  with  real  parametric  affine  uncertainties.  Earlier  results  of  Fu  (1990) 
are  combined  with  an  explicit  map  from  the  parameter  space  to  the  Nyquist  plane  to 
calculate  the  required  critical  perturbation  radius  with  high  precision  and  efficiency. 

In this chapter, a practical design approach based on parameter space methods
(Siljak, 1989) is proposed to illustrate the utility of the Nyquist robust stability margin as
a measure of robust stability. The process consists of two stages. First, the domain of
controller parameters that result in robustly stable closed-loop systems is determined.
Then, in a second stage, the set of optimal robustly stabilizing controller parameters is
obtained by optimizing a performance functional over the domain found in the first
stage. The result is a robustly stabilizing controller with specific optimal performance
characteristics. Section 6.2 introduces the design methodology. Then, Section 6.3
gives a design example based on this methodology. Concluding remarks are made in the
final section.

6.2.  Design  Methodology 
The robust stability analysis technique of the previous chapter based on the Nyquist
robust  stability  margin  can  now  be  utilized  in  the  design  of  robustly  stabilizing 
controllers.  A  common  objective  in  robust  systems  synthesis  is  to  design  a  controller  that 
is  robustly  stable  and  satisfies  a  nominal  performance  criterion.  This  is  usually  called 
robust  performance  in  the  control  literature.  The  robust  stability  of  a  controller  can  be 
determined,  for  example,  from  the  Nyquist  robust  stability  margin  plot  across  frequency 
for  the  system  containing  the  designed  controller  and  the  uncertain  plant.  If  the  system  is 
robustly  stable  and  nominal  performance  is  not  required,  then  no  further  synthesis  is 
necessary.  However,  if  nominal  performance  is  requested,  further  design  work  is 
required.  To  design  for  robust  performance  it  is  necessary  to  characterize  the  set  of 
robustly  stabilizing  controllers  and  performance  objectives  in  a  compatible  manner. 

In this chapter a practical two-step design approach based on parameter space
methods (Siljak, 1989) is proposed. First, the controller parameters that result in robustly
stable closed-loop systems are determined. Then, a performance objective is optimized
over the set of robustly stabilizing controller parameters, resulting in a robustly stabilizing
controller with a guaranteed level of performance. Note that the parameter space
considered here refers to the controller parameters rather than to the uncertain plant
parameters.

The  set  of  robustly  stable  controllers  in  the  parameter  space  is  implicitly 
determined  by  its  associated  "robust  stability  boundary",  that  is,  the  level  sets  in  the 
parameter  space  that  separate  robustly  stable  and  unstable  controllers.  These  are 
manifolds  where  the  Nyquist  robust  stability  margin  kN  is  equal  to  one.  In  general,  it  is 
not  possible  to  find  closed  form  expressions  that  describe  these  regions  in  the  space  of 
controller  parameters.  Therefore,  an  efficient  search  over  the  parameter  space  is  often 
necessary.  Since  nominal  stability  is  necessary  for  robust  stability  this  search  can  be 
restricted  to  the  subspace  of  nominally  stable  controllers. 

Because it is possible to parameterize all the nominally stable systems about a
specified stabilizing controller, the search can be initialized with a feasible point.
In this chapter we propose a simple
search strategy where robust stability is evaluated at a representative set of values of the
controller parameters, and the robust stability for the points lying in between is
determined by interpolation. Since the regions describing robustly stable controllers are
in general not simply connected, this may lead to an exhaustive search of the controller
parameter space.

One of the main limitations of parameter space methods is that it is difficult to
characterize regions of nominal and robust stability when the number of controller
parameters is greater than three. Therefore, we will focus on fixed-order controllers that
have at most three free parameters. An important class of controllers widely used in
industry that can be effectively designed using this approach is the PID controllers. To
illustrate our design procedure we will consider the proportional-integral (PI) controller,
which has the advantage of eliminating offset. The transfer function of a PI controller
is given by

$$c(s) = K_c \left( 1 + \frac{1}{\tau_I s} \right)$$

where $K_c$ and $\tau_I$ are the controller parameters.

The first step of the controller synthesis is to identify the regions in parameter
space of the controllers that result in nominal stability. For PI controllers this can be
achieved through a D-partition of the complex plane of the controller parameters
(Kiselev, 1997), or by checking the closed-loop poles of the system for the representative
set of controller parameters. These regions can be plotted in the parameter space with the
x-axis being $1/\tau_I$ and the y-axis being $K_c$. The manifolds that separate these regions in
the parameter space will be called the "nominal stability boundary".

Next, the Nyquist robust stability margin is calculated for each point in the
representative set of controller parameters that are nominally stabilizing. From these
values it is possible to plot level sets for constant values of the Nyquist robust stability
margin, the most important being the level set corresponding to a margin value of 1.
The regions in parameter space with Nyquist robust stability margin less than one
correspond to robustly stabilizing controllers. While it is possible to determine the
robust stability regions using a D-partition on the uncertain system, there is no measure
of robust stability for points inside these regions (Kiselev, 1997). Furthermore, the
nominal stability region and the robust stability region need not be connected.

The final step in the controller synthesis is to optimize some measure of performance
over the robustly stabilizing controllers to determine the final controller. Three classical
measures of performance will be considered:

Integral of the squared error (ISE)

$$\mathrm{ISE} = \int_0^\infty [e(t)]^2 \, dt$$

Integral of the absolute value of the error (IAE)

$$\mathrm{IAE} = \int_0^\infty |e(t)| \, dt$$

Integral of the time-weighted absolute error (ITAE)

$$\mathrm{ITAE} = \int_0^\infty t \, |e(t)| \, dt$$

The error signal $e(t)$ is the difference between the set point and the measurement. In
addition, the measures are based on an error signal resulting from a servo test of a unit-step
set-point change (Seborg, 1989).
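
As an illustration of how the three measures might be evaluated from a simulated or recorded response, the following minimal sketch approximates the integrals with the trapezoidal rule over a finite horizon. The function name `error_integrals` and the test signal $e(t) = e^{-t}$ are assumptions introduced only for this example; for that signal the exact values are ISE = 1/2, IAE = 1, and ITAE = 1, which gives a simple sanity check.

```python
import math


def error_integrals(t, e):
    """Trapezoidal estimates of (ISE, IAE, ITAE) from samples e[i] = e(t[i])."""
    ise = iae = itae = 0.0
    for i in range(1, len(t)):
        dt = t[i] - t[i - 1]
        ise += 0.5 * dt * (e[i - 1] ** 2 + e[i] ** 2)
        iae += 0.5 * dt * (abs(e[i - 1]) + abs(e[i]))
        itae += 0.5 * dt * (t[i - 1] * abs(e[i - 1]) + t[i] * abs(e[i]))
    return ise, iae, itae


# Hypothetical error signal e(t) = exp(-t), sampled on [0, 40];
# the horizon is long enough that the truncated tail is negligible.
n = 40000
t = [40.0 * i / n for i in range(n + 1)]
e = [math.exp(-ti) for ti in t]
ise, iae, itae = error_integrals(t, e)
print(ise, iae, itae)
```

In practice the horizon must be long enough for the error to have decayed; ITAE is the most sensitive of the three to truncation because it weights late errors most heavily.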

A search over the set of robustly stabilizing controllers is necessary to find a
controller that optimizes a desired performance measure. There are several ways of
performing this search. One approach is to calculate the performance measure at the
same time as the Nyquist robust stability margin is calculated and plot both the level sets
of constant stability margin and those of constant performance measure. A controller that
maximizes both stability robustness and nominal performance can then be determined
from this plot. Another approach is to search along constant Nyquist robust stability
margin contours by computing the performance measure along a desired contour that is
robustly stabilizing (i.e., whose margin value is less than 1). This leads to a controller that
maximizes nominal performance for a desired level of stability robustness. The advantage
of this approach is that the performance measure does not need to be calculated for all
values of the controller parameters.

An alternative approach to robust controller design is to perform a direct
parametric optimization of the performance objective with robust stability conditions
applied as constraints. Since in general the objective function is non-convex, and the
feasible set defined by the constraints is non-convex and not connected over the
controller parameters, the solution will also require an exhaustive search of the parameter
space (Luenberger, 1984).

6.3.  Design  Example 

Consider  the  uncertain  system 

^  ,** +(4  +  04,, -KX^  +  PO  +  g, -ft)  (6.!) 


where 


$$d(s,q) = s^4 + (9.5 + 0.5 q_1 - 0.5 q_2 + 0.5 q_3) s^3 + (27 + 2 q_1 + q_2) s^2 + (22.5 - q_1 + q_3) s + 0.1$$

and

$$(q_1, q_2, q_3) \in Q = \{ (q_1, q_2, q_3) \mid -3 \le q_i \le 3, \; i = 1, 2, 3 \}$$

System (6.1) is a modified version of the model investigated by Fu (1990). Note that the
transfer function coefficients depend affinely on the parameters $q \in Q$. Using the
proposed synthesis technique, a robustly stabilizing PI controller will be designed. A unity
feedback structure is obtained by defining $g(s,q) = c(s)p(s,q)$. The first step is to
determine the nominal and robust stability regions in terms of the controller parameter
space $K_c$ and $\tau_I$. For this example the D-partition method of Kiselev (1997) is used to
determine the nominal stability boundary. Then, the robust stability boundary is found
for a representative set of nominally stabilizing controller parameters by determining the
unity Nyquist robust stability margin contour. The results are shown in Figure 6.1.

Figure 6.1. Stability regions with nominal (continuous line) and robust
(dashed line) stability boundaries. The dashed lines correspond to
contours where $k_N = 1$.

Following the first design method proposed, the performance measure is calculated at
each value of the controller parameters. The resulting contours are shown in Figures 6.2,
6.3, and 6.4 for each of the performance objectives. Notice that the IAE and ITAE
minima, Figures 6.3 and 6.4, occur within the robust stability region. This indicates
that the controllers designed for nominal IAE and ITAE performance are also robustly
stabilizing. In contrast, the ISE minimum, Figure 6.2, occurs outside the robust stability
region, so the controller suggested by the ISE criterion is not robustly stabilizing.
Computation of the Nyquist robust stability margin verifies this, as shown in Table 6.1.
These results are as expected, because in general the controllers designed using the ITAE
performance criterion are more conservative, i.e., less aggressive, than those designed
using the IAE criterion, which in turn are more conservative than controllers designed
using the ISE criterion. Since the ISE design is more aggressive, it is less likely to be
robustly stabilizing, as the example shows.


Figure 6.2. ISE contours superimposed on the robust stability boundary.

The second proposed design method can now be used to design controllers that
have a greater degree of robust stability. This is done by minimizing the performance
measures along a constant Nyquist robust stability margin contour less than 1.

Table 6.1. Controllers resulting from minimizing the three performance criteria with
stability results.

Performance criterion   Optimal $K_c$   Optimal $\tau_I$   $k_N$     Robustly stabilizing
ISE                     1.8738          13.8489            1.3157    No
IAE                     0.9326          5.4623             0.9081    Yes
ITAE                    0.4642          3.7649             0.8306    Yes

For the ISE performance measure, the resulting controller will then be robustly
stabilizing, which was not the case for the previous method. As for the IAE and ITAE
performance measures, it is necessary to optimize along Nyquist robust stability margin
contours less than 0.9081 and 0.8306, respectively, to achieve a greater degree of robust
stability.

Figure 6.3. IAE contours superimposed on the robust stability boundary.

Table 6.2. Controllers resulting from minimizing the three performance criteria along the
0.8 Nyquist robust stability margin contour.

Performance criterion   Optimal $K_c$   Optimal $\tau_I$   $k_N$   Robustly stabilizing
ISE                     0.9609          8.5317             0.8     Yes
IAE                     0.6164          4.5204             0.8     Yes
ITAE                    0.4894          3.8476             0.8     Yes

To this end, the values of the performance measures along the Nyquist robust stability
margin contour of 0.8 are calculated, and the optimizing controllers are shown in Table 6.2.
A comparison of Table 6.1 and Table 6.2 shows that to achieve the greater degree
of robust stability (i.e., a Nyquist robust stability margin value of 0.8) the ISE and IAE
controllers designed using the second method require smaller values of the proportional
gain $K_c$ and integral time constant $\tau_I$.

Figure 6.4. ITAE contours superimposed on the robust stability boundary.

As for the ITAE performance criterion, the controllers designed using the two methods are
nearly identical, given that the change in the Nyquist robust stability margin value between
Tables 6.1 and 6.2 is small. The ISE and IAE controllers designed using the second
method can be considered more conservative than the controllers designed using the first
method, because the relative decrease in $K_c$ (increasing conservatism) is larger than the
relative decrease in $\tau_I$ (decreasing conservatism). It is not practical to determine relative
conservatism between the ITAE controllers, considering the small difference between the
two.
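
The comparison of relative changes can be reproduced directly from the tabulated settings. The short sketch below uses only the $K_c$ and $\tau_I$ values listed in the two tables of this example; the dictionary and function names are illustrative choices, not part of the design method. For the ISE controller the gain drops by roughly 49% while the integral time drops by roughly 38%; for the IAE controller the figures are roughly 34% and 17%.

```python
# (Kc, tau_I) pairs copied from the two tables of the design example.
first_method = {"ISE": (1.8738, 13.8489), "IAE": (0.9326, 5.4623), "ITAE": (0.4642, 3.7649)}
second_method = {"ISE": (0.9609, 8.5317), "IAE": (0.6164, 4.5204), "ITAE": (0.4894, 3.8476)}


def relative_change(old, new):
    """Signed relative change from old to new (negative means a decrease)."""
    return (new - old) / old


changes = {}
for name, (kc1, ti1) in first_method.items():
    kc2, ti2 = second_method[name]
    changes[name] = (relative_change(kc1, kc2), relative_change(ti1, ti2))
    d_kc, d_ti = changes[name]
    print(f"{name}: Kc {100 * d_kc:+.1f}%, tau_I {100 * d_ti:+.1f}%")
```

For the ITAE controller both parameters change by only a few percent, which is consistent with the statement that the two ITAE designs are nearly identical.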

6.4.  Conclusion 
The Nyquist robust stability margin $k_N(\omega)$ is an effective scalar measure of
robust stability. The main contribution of this paper is the introduction of a general
definition of the critical perturbation radius $\rho_c(\omega)$ that can account for non-convex
critical uncertainty value sets. This generalization of the critical direction theory is
illustrated for systems with affine parametric uncertainty, for which the critical
perturbation radius can be calculated precisely and efficiently. The computation of the
Nyquist robust stability margin involves planar geometry operations and solving linear
equality/inequality feasibility problems. A parameter space design method for robust
performance using the Nyquist robust stability margin is also demonstrated.


CHAPTER  7 

ROBUSTNESS  OF  CLASSICAL  PROPORTIONAL-INTEGRAL 

CONTROLLER  DESIGN  METHODS 

7.1.  Introduction 

The classical proportional-integral-derivative controllers, in use since the early 1940s,
are  now  all-pervasive  in  industrial  applications.  In  particular,  the  proportional-integral 
(PI)  version  of  this  classical  controller  has  found  widespread  application  in  processes 
where  the  presence  of  measurement  noise  does  not  permit  taking  advantage  of  the 
beneficial  effects  of  the  derivative  action.  PI  control  ensures  offset-free  performance,  and 
through  adequate  tuning  choices  is  able  to  deliver  aggressive  or  sluggish  responses,  as 
desired  by  the  control  designer.  Petroleum  refining  plants  may  include  as  many  as  two 
thousand  PI  control  loops,  while  other  applications,  such  as  the  precise  regulation  of  force 
at  the  tip  of  the  stylus  of  an  atomic-force  microscope,  may  involve  a  single  such  loop. 

A  PI  controller  that  is  poorly  tuned  may  render  the  loop  inherently  unstable,  and  in 
a  practical  application  this  instability  often  leads  to  a  permanent  saturation  of  the 
manipulated  actuator,  thus  rendering  the  controller  completely  ineffectual.  Unfortunately, 
when  the  process  involves  a  large  number  of  PI  loops,  instability-induced  saturations  due 
to  inadequate  tuning  may  remain  largely  unnoticed  as  long  as  a  subset  of  key  control 
loops  remain  saturation  free.  The  net  effect  is  a  decrease  in  the  overall  performance  of 
the  entire  control  system  because  the  unstable  loops  are  unable  to  contribute  their  share  to 
improving the dynamic quality of the variables they manipulate. Since PI controllers are
tuned using a number of classical correlations dating back to the famous Ziegler-Nichols
methods first proposed in 1942 (Ziegler and Nichols, 1942), it is
relevant to analyze the robust stability of such tuning correlations. This robust-stability
problem has not been previously studied in the literature, but recent advances in the
theory of robust control analysis, in particular those dealing with real parametric
uncertainties, now make it possible to address the issue in a quantitative fashion.

Since  the  appearance  of  Ziegler  and  Nichols'  seminal  work,  a  vast  literature  has 
emerged  on  alternative  approaches  for  tuning  controllers  of  the  PI  type  under  the 
assumption  that  the  process  being  controlled  is  adequately  described  by  the  open-loop 
stable  first-order-plus-delay  model 

$$p(s) = \frac{K e^{-\theta s}}{\tau s + 1} \tag{7.1}$$

where $K$ is the process gain, $\tau > 0$ is the time constant, and $\theta > 0$ is the time delay. The
three model parameters can be obtained from open-loop step-response tests, from
statistical parameter-estimation methods, model reduction techniques, etc. (Seborg et al.,
1989; Wallen, 1999).

Among  the  large  number  of  correlations  in  use  today  based  on  the  first-order-plus- 
dead-time  process  paradigm  and  found  in  most  introductory  texts  on  control  engineering 
are  the  widely-cited  prescriptions  proposed  by  Cohen  and  Coon  (1953),  who  tuned 
controllers  using  the  criterion  that  good  performance  is  attained  when  the  response 
realizes  a  one-quarter  decay  ratio.  Also  very  often  used  are  the  correlations  based  on 
error-integral  criteria  proposed  and  developed  by  Lopez  et  al.  (1967),  Murrill  (1967), 
Rovira  et  al.  (1969),  and  Smith  and  Corripio  (1985),  among  others,  and  that  include 
performance criteria such as the integral of the absolute value of the error (IAE), the
integral of the squared error (ISE), and the integral of the time-weighted absolute error
(ITAE). A summary of relevant correlations is given in Section 7.2 of this paper. All of
the  tuning  correlations  mentioned  have  been  carefully  developed  by  their  authors  to 
approximately  satisfy  specific  performance  criteria,  and  to  ensure  that  the  closed  loop  is 
stable.  More  recent  work  on  similar  controller  synthesis  methods  is  given  by  Schei 
(1994),  Langer  and  Landau  (1999),  and  Kristiansson  and  Lennartsson  (1999),  but  these 
design  methods  are  not  further  explored  in  this  paper,  because  the  earlier  works  are  more 
often  cited  and  are  therefore  more  reasonable  candidates  for  the  robust  analysis  technique 
presented. 

Although in all cases the tuning correlations recognize that model (7.1) is an
approximation, they do not address the fact that the process of parameter identification is
inherently affected by uncertainties. More specifically, the gain, time-constant, and
time-delay parameters may be respectively affected by errors $\Delta K$, $\Delta\tau$, and $\Delta\theta$. This paper
seeks to quantify the robust stability of classical tuning correlations for PI controllers with
respect to uncertainties in the gain, time-constant, and time-delay parameters of the
model.

A  major  objective  of  this  paper  is  the  calculation  of  a  meaningful  parametric 
stability  margin  for  this  class  of  systems,  which  can  then  be  used  as  a  quantitative  metric 
to  compare  alternative  PI  controller  tuning  rules  with  respect  to  robust  stability.  While 
standard  gain  and  phase  margins  may  also  be  used  to  get  some  idea  of  robustness,  it  is 
well  known  that  gain  and  phase  margins  may  be  fragile  safeguards  in  the  presence  of 
system uncertainties, as for example in the case of the state feedback design problem. The
results of this paper provide mechanisms for rigorously resolving these robustness issues
for the case of PI controller tuning for systems approximated by
first-order-plus-time-delay models.

The rest of the chapter is organized as follows. Section 7.2 introduces the closed-loop
feedback system, including the uncertain first-order-plus-time-delay process and the
proportional-integral controller, and also provides a general discussion of tuning rules.
The theory necessary for analyzing robust stability is given in Section 7.3, where the
zero-exclusion principle is applied to the uncertain closed-loop quasipolynomial, resulting in
a general characterization of the set of stable perturbations. Section 7.3 also introduces
measures of robust stability, including the gain margin, phase margin, and a novel
parametric stability margin. In Section 7.4 the results of Section 7.3 are applied to
determine the set of stabilizable perturbations for the ITAE regulation tuning rule, and the
parametric stability margin results are given for a variety of tuning rules. Finally,
concluding remarks are made in Section 7.5.

7.2.  Preliminaries 

The uncertain process and the PI controller are arranged in the standard feedback
configuration shown in Figure 7.1, where the variable $y(s)$ is the process output, $r(s)$ is the
set point, $d(s)$ is an additive disturbance, and $e(s) = r(s) - y(s)$ is the feedback error. The
uncertainty in the model and the structure and tuning correlations for the controller are
discussed next.


Figure 7.1. Feedback control structure with proportional-integral
controller $c(s)$ and uncertain process $p(s;q)$.

7.2.1. Process Model and Uncertainty Description

For the purposes of this study, it is convenient to represent the uncertain
first-order-plus-delay model as

$$p(s;q) = \frac{K e^{-\theta s}}{\tau s + 1} \tag{7.2}$$

where the process gain $K$, the time constant $\tau > 0$, and the time delay $\theta > 0$ are real
parameters, and where the uncertainty in the parameters is expressed in the multiplicative
form

$$K = \alpha_K K_0 \tag{7.3a}$$

$$\tau = \alpha_\tau \tau_0 \tag{7.3b}$$

$$\theta = \alpha_\theta \theta_0 \tag{7.3c}$$

where $K_0 \neq 0$, $\tau_0 > 0$, and $\theta_0 > 0$ are the known nominal values of the process
parameters, and the scalars $\alpha_K > 0$, $\alpha_\tau > 0$, and $\alpha_\theta > 0$ are unknown real multiplicative
perturbations that are collected in the uncertainty vector $q = [\alpha_K, \alpha_\tau, \alpha_\theta]^T \in Q \subset \mathbb{R}^3$,
which belongs to an uncertainty domain $Q$ composed of vectors with strictly positive
elements. The nominal process $p_0(s) := p(s;q_0)$ is recovered from (7.2) after setting
$q = q_0 := [1 \;\; 1 \;\; 1]^T$ to yield

$$p_0(s) = \frac{K_0 e^{-\theta_0 s}}{\tau_0 s + 1} \tag{7.4}$$

Note that the real multiplicative parametric uncertainty description (7.3) has the
associated additive perturbations

$$\Delta K := K_0(\alpha_K - 1) \tag{7.5a}$$

$$\Delta\tau := \tau_0(\alpha_\tau - 1) \tag{7.5b}$$

$$\Delta\theta := \theta_0(\alpha_\theta - 1) \tag{7.5c}$$

so that $K = K_0 + \Delta K$, $\tau = \tau_0 + \Delta\tau$, and $\theta = \theta_0 + \Delta\theta$. Also, a scaled additive perturbation
is defined as

$$\Delta q := \left[ \frac{\Delta K}{K_0} \;\; \frac{\Delta\tau}{\tau_0} \;\; \frac{\Delta\theta}{\theta_0} \right]^T$$

such that $q = q_0 + \Delta q$. The strict positivity of the multiplicative parameter $\alpha_K$ in (7.3a)
implies that the sign of the gain is invariant in $Q$, a constraint that is normally met in
applications since often the sign of the gain is known from the physics of the underlying
problem or from experience. Even though it is possible to extend the analysis to include
negative values of $\alpha_K$, that case is not of practical interest and is therefore not
considered further in this study.
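
The relation between the multiplicative and additive descriptions can be checked numerically. The sketch below is a minimal illustration: the function name and the numerical values of the nominal parameters and perturbations are hypothetical, chosen only to exercise (7.3) and (7.5).

```python
def additive_from_multiplicative(nominal, alpha):
    """Map multiplicative perturbations to additive ones.

    nominal = (K0, tau0, theta0); alpha = (alpha_K, alpha_tau, alpha_theta).
    Returns the additive perturbations of (7.5) and the scaled vector Delta q.
    """
    delta = tuple(x0 * (a - 1.0) for x0, a in zip(nominal, alpha))
    dq = tuple(d / x0 for d, x0 in zip(delta, nominal))
    return delta, dq


nominal = (2.0, 10.0, 3.0)  # hypothetical K0, tau0, theta0
alpha = (1.2, 0.9, 1.5)     # hypothetical alpha_K, alpha_tau, alpha_theta
delta, dq = additive_from_multiplicative(nominal, alpha)

# Perturbed parameters K, tau, theta recovered from the additive form.
perturbed = tuple(x0 + d for x0, d in zip(nominal, delta))
# q recovered as q0 + Delta q, with q0 = [1, 1, 1].
q = tuple(1.0 + d for d in dq)
print(perturbed, q)
```

Both routes give the same perturbed parameters, and the scaled additive vector differs from the multiplicative vector only by the nominal point $q_0$.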

7.2.2.  Proportional-Integral  Control  and  Controller  Tuning  Rules 

The proportional-integral controller considered in Figure 7.1 is of the classical form

$$c(s) = K_c \left( 1 + \frac{1}{\tau_I s} \right) \tag{7.6}$$

where the controller gain $K_c \neq 0$ and the integral time-constant $\tau_I > 0$ are adjustable
parameters. The PI controller (7.6) ensures the offset-free behavior of the closed loop in
both the standard servo-control problem (tracking of step changes in the set point in the
absence of disturbances) and in the standard regulation problem (rejection of step
changes in the disturbance while the set point remains constant). Tuning consists of
prescribing values of the controller parameters that satisfy a specific performance
criterion of interest to the control designer.



Some of the historically and practically most important tuning correlations are
given in Table 7.1, where the prescribed values of control gain and integral time-constant
are expressed through equations that require knowledge of the nominal process
parameters $K_0$, $\tau_0$, and $\theta_0$ of the nominal model (7.4). The tuning correlations proposed
by Ziegler and Nichols (1942) and those by Cohen and Coon (1953) shown in the table
are both based on the goal of achieving a quarter-decay ratio in the regulation response of
the loop. Lopez et al. (1967) and Rovira et al. (1969) developed the tuning rules shown
in the table that seek to minimize specific integral-error criteria of the servo or regulation
response, including the $\mathrm{IAE} := \int_0^\infty |e(t)| \, dt$, $\mathrm{ISE} := \int_0^\infty e(t)^2 \, dt$, and the
$\mathrm{ITAE} := \int_0^\infty t \, |e(t)| \, dt$ performance measures. Note that the entries in the table depend on
the time-delay-to-time-constant ratio $\theta_0/\tau_0$, a fact that is exploited later in the paper.
These tuning correlations have been developed and tested for nominal models in the
range $0.1 \le \theta_0/\tau_0 \le 1.0$, which represents a very wide range of practical processes of
interest. After invoking any of these tuning correlations, the parameters of the PI
controller (7.6) could be written in the form $K_c = K_c(K_0, \theta_0/\tau_0)$ and
$\tau_I = \tau_I(\tau_0, \theta_0/\tau_0)$, where the specific functional dependencies are given by the table
entries, and hence it follows that $c(s) = c(s;q_0)$ in a very specific sense. For simplicity
of notation, however, in the sequel the controller is denoted as $c(s)$ and its parameters are
denoted as $K_c$ and $\tau_I$, since their dependence on the nominal process parameters is
implicitly understood.

It is well known that in the absence of parametric uncertainty, the Ziegler-Nichols
and the Cohen-Coon methods tend to yield aggressive and oscillatory responses, a
behavior consistent with their explicit goal of attaining one-quarter decay ratios in the
response. The ITAE-prescribed settings, in contrast, tend to produce slower responses.
Simulation studies can be carried out to suggest that the ITAE settings are often more
robust than those proposed by the IAE and ISE methods, but no rigorous quantitative
evidence has been provided in the previous literature. This paper seeks to quantify the
relative stability robustness of all the tuning correlations represented in Table 7.1 by means
of a parametric stability margin whose computation is one of the main objectives of this
paper.


Table 7.1. Controller tuning correlations for proportional-integral controllers for
the servo and the regulation problems.

Method            Problem              $K_0 K_c$                                        $\tau_0/\tau_I$                                                           Reference
Ziegler-Nichols   Servo control        $0.9(\theta_0/\tau_0)^{-1}$                      $0.3(\theta_0/\tau_0)^{-1}$                                               (Ziegler, 1942)
Cohen-Coon        Servo control        $\frac{1}{12} + 0.9(\theta_0/\tau_0)^{-1}$       $\frac{20 + 9(\theta_0/\tau_0)^{-1}}{30 + 3(\theta_0/\tau_0)}$            (Cohen, 1953)
IAE               Servo control        $0.758(\theta_0/\tau_0)^{-0.861}$                $1.02 - 0.323(\theta_0/\tau_0)$                                           (Rovira, 1969)
ITAE              Servo control        $0.586(\theta_0/\tau_0)^{-0.916}$                $1.03 - 0.165(\theta_0/\tau_0)$                                           (Rovira, 1969)
ISE               Regulation control   $1.305(\theta_0/\tau_0)^{-0.960}$                $0.492(\theta_0/\tau_0)^{-0.739}$                                         (Lopez, 1967)
IAE               Regulation control   $0.984(\theta_0/\tau_0)^{-0.986}$                $0.608(\theta_0/\tau_0)^{-0.707}$                                         (Lopez, 1967)
ITAE              Regulation control   $0.859(\theta_0/\tau_0)^{-0.977}$                $0.674(\theta_0/\tau_0)^{-0.680}$                                         (Lopez, 1967)
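
To make the use of the table concrete, the following sketch turns a subset of its rows into PI settings for a given nominal model. It is an illustration only: the function name, the dictionary keys, and the numerical process values are hypothetical, the coefficients are as read from Table 7.1, and the inclusive range check on the ratio is an assumption about how the stated validity range should be enforced.

```python
def pi_settings(method, K0, tau0, theta0):
    """Return (Kc, tau_I) from a tuning correlation of Table 7.1.

    Each rule gives the dimensionless factors K0*Kc and tau0/tau_I as
    functions of the ratio r = theta0/tau0 only.
    """
    r = theta0 / tau0
    # Assumed validity range of the correlations (inclusive bounds assumed).
    if not 0.1 <= r <= 1.0:
        raise ValueError("theta0/tau0 is outside the range 0.1 to 1.0")
    rules = {
        "ziegler-nichols": (0.9 / r, 0.3 / r),
        "cohen-coon": (1.0 / 12.0 + 0.9 / r, (20.0 + 9.0 / r) / (30.0 + 3.0 * r)),
        "iae-servo": (0.758 * r ** -0.861, 1.02 - 0.323 * r),
        "itae-servo": (0.586 * r ** -0.916, 1.03 - 0.165 * r),
        "iae-regulation": (0.984 * r ** -0.986, 0.608 * r ** -0.707),
        "itae-regulation": (0.859 * r ** -0.977, 0.674 * r ** -0.680),
    }
    k0kc, tau0_over_taui = rules[method]
    return k0kc / K0, tau0 / tau0_over_taui


# Hypothetical nominal model: K0 = 2, tau0 = 10, theta0 = 5 (ratio 0.5).
for name in ("ziegler-nichols", "cohen-coon", "itae-regulation"):
    Kc, tau_I = pi_settings(name, 2.0, 10.0, 5.0)
    print(f"{name}: Kc = {Kc:.4f}, tau_I = {tau_I:.4f}")
```

Because every rule depends on the process only through $K_0$, $\tau_0$, and the ratio $\theta_0/\tau_0$, two processes with the same ratio receive the same dimensionless factors $K_0 K_c$ and $\tau_0/\tau_I$.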

7.3.  Analysis  of  Robust  Stability 
7.3.1.  Conditions  for  Robust  Stability 

Straightforward block-diagram algebra operations show that the stability of the
closed-loop system of Figure 7.1 is determined by the properties of the quasipolynomial
of degree 2 (Bhattacharyya, 1995)

$$\delta(s;q) = \alpha_K K_0 K_c (\tau_I s + 1) e^{-\alpha_\theta \theta_0 s} + \alpha_\tau \tau_0 \tau_I s^2 + \tau_I s \tag{7.7}$$

where $q \in Q$, and where the PI control parameters $K_c$ and $\tau_I$ are selected using a tuning
correlation taken from Table 7.1 and are therefore functions of $q_0$. It is known that the
nominal quasipolynomial $\delta(s;q_0)$ is Hurwitz, because the tuning correlations considered
in Table 7.1 yield controller settings that guarantee the stability of the nominal closed
loop. Let

$$\Delta(s) := \{ \delta(s;q) \mid q \in Q \} \tag{7.8}$$

denote the family of quasipolynomials generated by (7.7) for all $q \in Q$, where the
uncertainty domain is a simply-connected open subset of $\mathbb{R}^3$ consisting of vectors
with strictly positive elements. The robust stability of the closed loop with respect to an
uncertainty domain $Q$ is ensured if and only if the entire family of quasipolynomials
$\Delta(s)$ is Hurwitz. This in turn can be ensured through the zero-exclusion principle
enunciated in the following lemma.

Lemma 7.1. Given the parametric uncertainty $q \in Q$ and the family of
quasipolynomials $\Delta(s)$ of constant degree whose nominal quasipolynomial
$\delta(s;q_0)$ is Hurwitz, then every element of $\Delta(s)$ is Hurwitz if and only if the
image set $\Delta(j\tilde\omega)$ excludes the origin for all $\tilde\omega \ge 0$.

Proof. First note that the symbol $\tilde\omega$ is used to denote the standard frequency
variable measured in reciprocal seconds (in the sequel we introduce the dimensionless
frequency $\omega = \tilde\omega\theta_0$). Also note that the degree of (7.7) is equal to 2 for all $q \in Q$ because
the coefficient $\alpha_\tau\tau_0\tau_I$ of the monomial $\alpha_\tau\tau_0\tau_I s^2$ is always nonzero since $\alpha_\tau$, $\tau_0$, and
$\tau_I = \tau_I(\tau_0, \theta_0/\tau_0)$ are strictly positive parameters. As discussed before, $\delta(s;q_0)$ is
Hurwitz. Therefore, from the continuity of the roots of $\delta(s;q)$ for $q \in Q$, Lemma 7.1
follows from the application of the Boundary Crossing Theorem (Bhattacharyya, 1995)
for quasipolynomials, specialized to the Hurwitz case where the stability region of
interest is the open left-half plane. Q.E.D.

Applying  Lemma  7.1  to  the  system  being  considered  yields  the  following  stability 
result. 

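Lemma 7.2. The closed-loop system of Figure 7.1 with PI control parameters
adjusted using a tuning correlation from Table 7.1 is robustly stable with respect
to all parametric uncertainties $q = [\alpha_K \;\; \alpha_\tau \;\; \alpha_\theta]^T \in Q$ if and only if the inequality

$$A(\omega, \alpha_\theta) \begin{bmatrix} \alpha_K \\ \alpha_\tau \end{bmatrix} \neq b(\omega) \tag{7.9a}$$

holds for all $q \in Q$ and for all $\omega \ge 0$, where

$$A(\omega, \alpha_\theta) := \begin{bmatrix} K_0 K_c \left( \left(\frac{\tau_0}{\tau_I}\right)\left(\frac{\theta_0}{\tau_0}\right) \cos(\alpha_\theta\omega) + \omega \sin(\alpha_\theta\omega) \right) & -\left(\frac{\theta_0}{\tau_0}\right)^{-1} \omega^2 \\ K_0 K_c \left( \omega \cos(\alpha_\theta\omega) - \left(\frac{\tau_0}{\tau_I}\right)\left(\frac{\theta_0}{\tau_0}\right) \sin(\alpha_\theta\omega) \right) & 0 \end{bmatrix} \in \mathbb{R}^{2 \times 2} \tag{7.9b}$$

$$b(\omega) := \begin{bmatrix} 0 \\ -\omega \end{bmatrix} \in \mathbb{R}^2 \tag{7.9c}$$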

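Proof. First note that equations (7.9a)-(7.9c) are in terms of the dimensionless frequency $\omega$ defined as $\omega := \tilde\omega\theta_0$, where $\tilde\omega$ is the standard frequency with units of reciprocal seconds and $\theta_0$ is the nominal value of the time delay. Therefore, without loss of generality, it suffices to apply Lemma 7.1 to the image $\delta(j\tilde\omega;q) = \delta(j\omega/\theta_0;q)$ of the quasipolynomials (7.7), i.e.,

$$\delta\!\left(\frac{j\omega}{\theta_0};q\right) = \alpha_K K_0K_c\left(\tau_I\frac{j\omega}{\theta_0} + 1\right)e^{-j\alpha_\theta\omega} + \alpha_\tau\tau_0\tau_I\left(\frac{j\omega}{\theta_0}\right)^2 + \tau_I\frac{j\omega}{\theta_0}$$

or

$$\delta\!\left(\frac{j\omega}{\theta_0};q\right) = \alpha_K K_0K_c\left(j\frac{\tau_I}{\theta_0}\omega + 1\right)\bigl(\cos(\alpha_\theta\omega) - j\sin(\alpha_\theta\omega)\bigr) - \alpha_\tau\frac{\tau_0\tau_I}{\theta_0^2}\omega^2 + j\frac{\tau_I}{\theta_0}\omega \tag{7.10}$$

Then from (7.10) and Lemma 7.1 it follows that the system is robustly stable with respect to $\mathcal{Q}$ if and only if

$$\alpha_K K_0K_c\left(j\frac{\tau_I}{\theta_0}\omega + 1\right)\bigl(\cos(\alpha_\theta\omega) - j\sin(\alpha_\theta\omega)\bigr) - \alpha_\tau\frac{\tau_0\tau_I}{\theta_0^2}\omega^2 + j\frac{\tau_I}{\theta_0}\omega \neq 0 \tag{7.11}$$

for all $q \in \mathcal{Q}$ and all $\tilde\omega > 0$, or similarly $\omega > 0$ because $\theta_0$ is strictly positive. Separating the real and imaginary parts, multiplying through by the strictly positive factor $\theta_0/\tau_I = (\tau_0/\tau_I)(\theta_0/\tau_0)$, and factoring out the parameter perturbations $\alpha_K$ and $\alpha_\tau$ gives the following vector-matrix form of inequality (7.11):

$$\begin{bmatrix} K_0K_c\left(\left(\dfrac{\tau_0}{\tau_I}\right)\left(\dfrac{\theta_0}{\tau_0}\right)\cos(\alpha_\theta\omega) + \omega\sin(\alpha_\theta\omega)\right) & -\left(\dfrac{\theta_0}{\tau_0}\right)^{-1}\omega^2 \\[2ex] K_0K_c\left(\omega\cos(\alpha_\theta\omega) - \left(\dfrac{\tau_0}{\tau_I}\right)\left(\dfrac{\theta_0}{\tau_0}\right)\sin(\alpha_\theta\omega)\right) & 0 \end{bmatrix}\begin{bmatrix}\alpha_K \\ \alpha_\tau\end{bmatrix} \neq \begin{bmatrix} 0 \\ -\omega \end{bmatrix}$$

which is the same as (7.9), completing the proof. Q.E.D.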
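
The equivalence between (7.11) and (7.9) can be checked numerically. The following is a minimal sketch, not part of the original development: the function names are hypothetical, and the numerical constants are assumed illustrative values (the ITAE regulation rule at $\theta_0/\tau_0 = 0.5$). It confirms that the residual $A[\alpha_K \; \alpha_\tau]^T - b$ coincides with the real and imaginary parts of (7.10) scaled by $\theta_0/\tau_I$.

```python
import cmath
import math

# Assumed illustrative values (ITAE regulation rule at theta_0/tau_0 = 0.5).
K0KC = 1.691       # product K_0 K_c
RATIO = 0.5        # theta_0 / tau_0
C = 1.080 * RATIO  # (tau_0/tau_I)(theta_0/tau_0)


def A_and_b(w, a_th):
    """Return A(w, alpha_theta) of (7.9b) and b(w) of (7.9c) as nested lists."""
    cs, sn = math.cos(a_th * w), math.sin(a_th * w)
    A = [[K0KC * (C * cs + w * sn), -(w ** 2) / RATIO],
         [K0KC * (w * cs - C * sn), 0.0]]
    b = [0.0, -w]
    return A, b


def residual(w, a_K, a_T, a_th):
    """A [a_K, a_T]^T - b; both entries vanish exactly when (7.9a) fails."""
    A, b = A_and_b(w, a_th)
    return [A[i][0] * a_K + A[i][1] * a_T - b[i] for i in range(2)]


def scaled_delta(w, a_K, a_T, a_th):
    """(theta_0/tau_I) * delta(j w / theta_0; q), with delta taken from (7.10)."""
    return (a_K * K0KC * (1j * w + C) * cmath.exp(-1j * a_th * w)
            - a_T * w ** 2 / RATIO + 1j * w)


w, q = 1.3, (1.2, 0.8, 1.1)   # arbitrary test frequency and perturbation
res = residual(w, *q)
d = scaled_delta(w, *q)
print(res, (d.real, d.imag))
```

The two printed pairs agree to rounding error for any positive frequency and perturbation, which is the content of the last step of the proof.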

The net effect of using multiplicative perturbations and the dimensionless frequency $\omega$, as opposed to the standard frequency $\tilde\omega$, is that expression (7.9) explicitly shows the role of the ratio $\theta_0/\tau_0$. Note that $\theta_0/\tau_0$ is the only parameter needed to extract the values of the factors $K_0K_c$ and $\tau_0/\tau_I$ from the tuning correlations of Table 7.1; hence, the product $(\tau_0/\tau_I)(\theta_0/\tau_0)$ is not further simplified in (7.9b)-(7.9c). Furthermore, through (7.9), Lemma 7.2 completely describes the robust stability characteristics of the system. Hence, these characteristics depend only on $\theta_0/\tau_0$ and the particular tuning rule selected.

7.3.2.  Parametric  Boundaries  for  Robust  Stability 

The theoretical development of the previous section is valid for any uncertainty domain $\mathcal{Q}$ that is a simply connected open subset of $\mathbb{R}^3$ consisting of strictly positive elements and that contains the nominal point $q_0 = [1, 1, 1]^T$. This section seeks to characterize the largest uncertainty domain for which the closed-loop system is robustly stable, thereby giving the region of all stabilizable multiplicative perturbations of the nominal parameters. Such a region, denoted $\mathcal{Q}_{\max}$, is fully described by its boundary $\partial\mathcal{Q}_{\max}$. Obviously, any vector $q$ for which $\alpha_K = 0$, $\alpha_\tau = 0$, or $\alpha_\theta = 0$ is a possible element of the boundary of $\mathcal{Q}_{\max}$. The challenge is to find all the strictly positive vectors $q$ that are elements of $\partial\mathcal{Q}_{\max}$. First note that for every strictly positive element $q = [\alpha_K \;\; \alpha_\tau \;\; \alpha_\theta]^T$ of the parametric robust-stability boundary, the characteristic quasipolynomial $\delta(s;q)$ is not Hurwitz and the family of quasipolynomials produced by $\mathcal{Q}_\epsilon = \{q + \Delta q : \|\Delta q\| < \epsilon\}$ contains at least one element that is Hurwitz, where $\epsilon$ is an arbitrary positive real scalar and $\|\cdot\|$ is a vector norm. This fact gives rise to the following theorem.

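Theorem 7.1. If the strictly positive uncertainty vector $q = [\alpha_K \;\; \alpha_\tau \;\; \alpha_\theta]^T$ is an element of the parametric robust-stability boundary $\partial\mathcal{Q}_{\max}$ for the closed-loop system of Figure 7.1 with PI control parameters adjusted using a tuning correlation from Table 7.1, then $q$ must satisfy the parametrized map

$$\alpha_K = \frac{-\omega}{K_0K_c\left(\omega\cos(\alpha_\theta\omega) - \left(\dfrac{\theta_0}{\tau_0}\right)\left(\dfrac{\tau_0}{\tau_I}\right)\sin(\alpha_\theta\omega)\right)} \tag{7.12a}$$

$$\alpha_\tau = -\left(\frac{\theta_0}{\tau_0}\right)\frac{\left(\dfrac{\theta_0}{\tau_0}\right)\left(\dfrac{\tau_0}{\tau_I}\right)\cos(\alpha_\theta\omega) + \omega\sin(\alpha_\theta\omega)}{\left(\omega\cos(\alpha_\theta\omega) - \left(\dfrac{\theta_0}{\tau_0}\right)\left(\dfrac{\tau_0}{\tau_I}\right)\sin(\alpha_\theta\omega)\right)\omega} \tag{7.12b}$$

for some $\omega > 0$.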

Proof. By definition the closed-loop system is robustly stable with respect to $\mathcal{Q}_{\max}$, and for any $q = [\alpha_K \;\; \alpha_\tau \;\; \alpha_\theta]^T \in \partial\mathcal{Q}_{\max}$, the set $\mathcal{Q}_{\max} \cup \{q\}$ is not robustly stable. From Lemma 7.2 this implies that $q$ must satisfy the equality $A(\omega,\alpha_\theta)[\alpha_K \;\; \alpha_\tau]^T = b(\omega)$ for some $\omega > 0$. All relevant solutions to this vector-matrix equality are given by (7.12a)-(7.12b). Further details of the proof are given in Appendix C. Q.E.D.
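
As a quick illustration of Theorem 7.1, the sketch below evaluates the map (7.12) at one frequency and verifies that the resulting pair $(\alpha_K, \alpha_\tau)$ makes the scaled quasipolynomial (7.10) vanish. The function names are hypothetical and the constants are assumed illustrative values (ITAE regulation rule at $\theta_0/\tau_0 = 0.5$); this is a sanity check under those assumptions, not the procedure of Appendix C.

```python
import cmath
import math

# Assumed illustrative values (ITAE regulation rule at theta_0/tau_0 = 0.5).
K0KC, RATIO = 1.691, 0.5
C = 1.080 * RATIO  # (tau_0/tau_I)(theta_0/tau_0)


def boundary_map(w, a_th):
    """Candidate boundary coordinates (alpha_K, alpha_tau) from (7.12a)-(7.12b)."""
    cs, sn = math.cos(a_th * w), math.sin(a_th * w)
    f_n = C * cs + w * sn   # numerator function of alpha_tau
    f_d = w * cs - C * sn   # common denominator function
    return -w / (K0KC * f_d), -RATIO * f_n / (w * f_d)


def scaled_delta(w, a_K, a_T, a_th):
    """(theta_0/tau_I) * delta(j w / theta_0; q), with delta taken from (7.10)."""
    return (a_K * K0KC * (1j * w + C) * cmath.exp(-1j * a_th * w)
            - a_T * w ** 2 / RATIO + 1j * w)


a_th = 1.0
a_K, a_T = boundary_map(2.0, a_th)
print(a_K, a_T, abs(scaled_delta(2.0, a_K, a_T, a_th)))
```

At this frequency both coordinates are positive, so the pair is an admissible candidate boundary point, and the quasipolynomial has a root on the imaginary axis as expected.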

Note that for a given ratio $\theta_0/\tau_0$ the candidate uncertainty-boundary coordinates $\alpha_K$ and $\alpha_\tau$ in (7.12) are parametrized in terms of $\alpha_\theta$. Equations (7.12a)-(7.12b) trace curves in the $\alpha_\tau$-$\alpha_K$ space as the frequency varies for a fixed value of $\theta_0/\tau_0$ and for an arbitrarily selected value $\alpha_\theta > 0$. A subset of these curves defines the boundary set $\partial\mathcal{Q}_{\max}$. In particular, note that if at a given frequency the values $\alpha_K$ and $\alpha_\tau$ obtained from (7.12) are not simultaneously positive, then they are not elements of $\partial\mathcal{Q}_{\max}$ because such pairs are not admissible. Therefore, it can be anticipated that $\partial\mathcal{Q}_{\max}$ can be characterized by evaluating (7.12) within a set of selected frequency intervals, since at some frequencies the maps yield inadmissible solutions. The subintervals of frequency
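that yield admissible solutions can be identified by finding the frequencies at which $\alpha_K$ and $\alpha_\tau$ defined in (7.12) are simultaneously positive.

Lemma 7.3. Consider all parameters $\alpha_K$ and $\alpha_\tau$ given by (7.12) of Theorem 7.1. Then $\alpha_K > 0$ and $\alpha_\tau > 0$ if and only if

$$\omega \in \Omega_1 \cup \Omega_2 \cup \Omega_3 \cup \cdots$$

where

$$\Omega_1 := \begin{cases} (0, \omega_{n1}) & \text{if } 1 < \dfrac{\tau_0}{\tau_I}\dfrac{\theta_0}{\tau_0}\alpha_\theta < \infty \\[1ex] (\omega_{d1}, \omega_{n1}) & \text{otherwise} \end{cases}$$

$$\Omega_2 := \begin{cases} (\omega_{d2}, \omega_{n3}) & \text{if } 1 < \dfrac{\tau_0}{\tau_I}\dfrac{\theta_0}{\tau_0}\alpha_\theta < \infty \\[1ex] (\omega_{d3}, \omega_{n3}) & \text{otherwise} \end{cases}$$

$$\Omega_3 := \begin{cases} (\omega_{d4}, \omega_{n5}) & \text{if } 1 < \dfrac{\tau_0}{\tau_I}\dfrac{\theta_0}{\tau_0}\alpha_\theta < \infty \\[1ex] (\omega_{d5}, \omega_{n5}) & \text{otherwise} \end{cases}$$

and where $\omega_{n1}, \omega_{n2}, \cdots$ are the positive zeros of

$$f_n(\omega) := \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\cos(\alpha_\theta\omega) + \omega\sin(\alpha_\theta\omega) \tag{7.13}$$

arranged in increasing order, and $\omega_{d1}, \omega_{d2}, \cdots$ are the positive zeros of

$$f_d(\omega) := \omega\cos(\alpha_\theta\omega) - \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\sin(\alpha_\theta\omega) \tag{7.14}$$

arranged in increasing order.

Proof. A comprehensive proof is given in Appendix D. Q.E.D.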
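
The first interval $\Omega_1$ of Lemma 7.3 can be located numerically by scanning for the first sign changes of $f_d$ and $f_n$ and refining them by bisection. The sketch below is one possible way to do this; the helper names and the scan step are assumptions, and the constant `C` stands for an assumed value of $(\tau_0/\tau_I)(\theta_0/\tau_0)$ corresponding to the ITAE regulation rule at $\theta_0/\tau_0 = 0.5$.

```python
import math

C = 0.54  # assumed value of (tau_0/tau_I)(theta_0/tau_0)


def f_n(w, a_th):
    """Function (7.13)."""
    return C * math.cos(a_th * w) + w * math.sin(a_th * w)


def f_d(w, a_th):
    """Function (7.14)."""
    return w * math.cos(a_th * w) - C * math.sin(a_th * w)


def first_positive_zero(func, a_th, step=1e-3):
    """Scan upward for the first sign change of func, then refine by bisection."""
    left = 1e-9
    sign0 = func(left, a_th)
    right = left + step
    while func(right, a_th) * sign0 > 0:
        left, right = right, right + step
    for _ in range(60):
        mid = 0.5 * (left + right)
        if func(mid, a_th) * sign0 > 0:
            left = mid
        else:
            right = mid
    return 0.5 * (left + right)


def first_interval(a_th):
    """Endpoints of Omega_1 following the case distinction of Lemma 7.3."""
    w_n1 = first_positive_zero(f_n, a_th)
    if C * a_th > 1.0:
        return 0.0, w_n1
    return first_positive_zero(f_d, a_th), w_n1


lo, hi = first_interval(1.0)
print(lo, hi)
```

Inside the returned interval $f_d$ is negative and $f_n$ is positive, which is exactly the condition for $\alpha_K$ and $\alpha_\tau$ in (7.12) to be simultaneously positive.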

Lemma 7.3 gives the frequency intervals for which the values of $\alpha_K$ and $\alpha_\tau$ produced by the map (7.12) are simultaneously positive. For each interval, the parametric plots of $\alpha_K$ and $\alpha_\tau$ trace a curve in the $\alpha_\tau$-$\alpha_K$ space. From Theorem 7.1, the points on these curves are candidate members of the robust stability boundary. Unfortunately, there is an infinite number of frequency intervals $\Omega_i$ and therefore also an infinite number of curves. It is not possible to check all the frequency intervals and corresponding curves to determine the robust stability boundary. Fortunately, the following theorem states that the first frequency interval gives the curve that is the robust stability boundary, and that all other frequency intervals need not be considered.

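Theorem 7.2. The strictly positive uncertainty vector $q = [\alpha_K \;\; \alpha_\tau \;\; \alpha_\theta]^T$ is an element of the parametric robust-stability boundary $\partial\mathcal{Q}_{\max}$ for the closed-loop system of Figure 7.1 with PI control parameters adjusted using a tuning correlation from Table 7.1 if and only if $q$ satisfies (7.12) for some $\omega \in \Omega_1$, where

$$\Omega_1 := \begin{cases} (0, \omega_{n1}) & \text{if } 1 < \dfrac{\tau_0}{\tau_I}\dfrac{\theta_0}{\tau_0}\alpha_\theta < \infty \\[1ex] (\omega_{d1}, \omega_{n1}) & \text{otherwise} \end{cases}$$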

Proof. First, $\mathcal{Q}_{\max}$ is the region that contains the nominal point and that is bounded by the curves given by all the frequency intervals and by $\alpha_K = 0$, $\alpha_\tau = 0$, and $\alpha_\theta = 0$. Second, note that from (7.12) and (7.13)-(7.14) the endpoints of each frequency interval are the zeros of the numerator and denominator of $\alpha_\tau$, so that $\alpha_\tau$ ranges from zero to infinity over each frequency interval. Also, $\alpha_\tau$ is monotonically decreasing over each frequency interval. Therefore, there is a one-to-one relationship between positive values of $\alpha_\tau$ and frequency in each interval. That is, given a positive value of $\alpha_\tau$ there is one and only one corresponding frequency in each frequency interval. This in turn implies that for any positive value of $\alpha_\tau$ there is a unique value of $\alpha_K$ for each frequency interval. Hence, each curve divides the first quadrant of the $\alpha_\tau$-$\alpha_K$ plane into two regions. For $\alpha_\theta = 1$, the lower region always contains the nominal point; therefore, $\mathcal{Q}_{\max}$ is the open set for which $\alpha_\tau$ ranges from zero to infinity and $\alpha_K$ ranges from zero to the lowest value of $\alpha_K$ given by (7.12) over all frequency intervals. This implies that the robust-stability boundary $\partial\mathcal{Q}_{\max}$ is the curve that gives the lowest value of $\alpha_K$ for each value of $\alpha_\tau$. Hence, it is sufficient to show that the curve obtained from the first frequency range $\Omega_1$ of Lemma 7.3 gives the smallest value of $\alpha_K$ for each value of $\alpha_\tau$. Further details of the proof are given in Appendix E. Q.E.D.

Theorem 7.2 defines the region of stable parameter perturbations $\mathcal{Q}_{\max}$ in terms of its boundary for a fixed perturbation in time delay, $\alpha_\theta$. For the nominal value $\alpha_\theta = 1$, the above results give the stability boundary when there is no uncertainty in the time delay. To characterize the effect of arbitrary perturbations in the time delay, the above analysis can be performed over a range of values $\alpha_\theta > 0$. The result is a different stability boundary curve for each value of $\alpha_\theta$, and hence in the $\alpha_\tau$-$\alpha_\theta$-$\alpha_K$ space the stability boundary becomes a surface. As such, every point $q = [\alpha_K \;\; \alpha_\tau \;\; \alpha_\theta]^T$ on the stability boundary surface must satisfy

$$\alpha_K = f(\alpha_\tau,\alpha_\theta) \tag{7.15}$$

where $f(\alpha_\tau,\alpha_\theta)$ depends implicitly on the tuning rule adopted from Table 7.1 because it features the product $K_0K_c$ and the ratio $\tau_0/\tau_I$. Given a pair of boundary coordinates $\bar\alpha_\tau$ and $\bar\alpha_\theta$, the mapping (7.15) is found as follows. First, for notational simplicity let $\alpha_K = f_K(\omega,\alpha_\theta)$ and $\alpha_\tau = f_\tau(\omega,\alpha_\theta)$ respectively represent equations (7.12a) and (7.12b), whose right-hand sides unambiguously define the mappings $f_K : (\omega,\alpha_\theta) \to \alpha_K$ and $f_\tau : (\omega,\alpha_\theta) \to \alpha_\tau$. Note that for a given coordinate $\alpha_\theta$ the mapping $f_\tau$ defines a one-to-one correspondence between $\omega \in \Omega_1$ and $\alpha_\tau > 0$. This implies that the inverse map $f_\tau^{-1} : (\alpha_\tau,\alpha_\theta) \to \omega$ exists. Next, set $\alpha_\theta = \bar\alpha_\theta$ and $\alpha_\tau = \bar\alpha_\tau$ in (7.12b) to define $\bar\alpha_\tau = f_\tau(\omega,\bar\alpha_\theta)$, and solve the equation to obtain $\bar\omega = f_\tau^{-1}(\bar\alpha_\tau,\bar\alpha_\theta)$ with $\bar\omega \in \Omega_1$ as prescribed by Theorem 7.2. Finally, substitute the frequency value into (7.12a) to obtain $\bar\alpha_K = f_K(f_\tau^{-1}(\bar\alpha_\tau,\bar\alpha_\theta),\bar\alpha_\theta) =: f(\bar\alpha_\tau,\bar\alpha_\theta)$. Hence, the map (7.15) is readily computable. In general, $f(\alpha_\tau,\alpha_\theta)$ does not have a closed-form expression because $f_\tau^{-1}(\alpha_\tau,\alpha_\theta)$ itself does not have one. Extensive numerical studies have shown, however, that approximate closed-form expressions for $f_\tau^{-1}$ can be obtained with high accuracy through least-squares fits to simple functional forms. Nonetheless, this avenue is not pursued further in this paper. Finally, the system is stable with respect to the arbitrary multiplicative uncertainty $q = [\alpha_K \;\; \alpha_\tau \;\; \alpha_\theta]^T$ if the uncertainty satisfies

$$\alpha_K < f(\alpha_\tau,\alpha_\theta) \tag{7.16}$$

Using this expression it can easily be determined whether a given set of uncertainties destabilizes the system for a specific tuning rule and its associated choice of tuning parameter. Furthermore, it follows that a complete characterization of the region of stabilizable multiplicative perturbations is compactly given by

$$\mathcal{Q}_{\max} := \left\{ q = [\alpha_K \;\; \alpha_\tau \;\; \alpha_\theta]^T : 0 < \alpha_K < f(\alpha_\tau,\alpha_\theta),\ \alpha_\tau > 0,\ \alpha_\theta > 0 \right\}$$

Clearly, the size of $\mathcal{Q}_{\max}$ can be used to compare different tuning rules: tuning rules with larger regions of stable parameter perturbations are more robustly stable. To better characterize the size of the stability region, the next section introduces an appropriate parametric stability margin along with the classical gain and phase margins, and discusses their relationship to the robust stability boundary. The parametric stability margin is then used to characterize the robustness of the controllers for quantitative purposes when considering a given tuning rule, and for qualitative purposes when comparing alternative tuning rules.
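
The construction of the map (7.15) and the stability test (7.16) can be sketched as follows. This is a minimal illustration rather than the implementation used for the reported results: the function names are hypothetical, the constants are assumed values for the ITAE regulation rule at $\theta_0/\tau_0 = 0.5$, and the inversion of $f_\tau$ relies on the monotonicity of $\alpha_\tau$ over $\Omega_1$ stated in the proof of Theorem 7.2.

```python
import math

# Assumed illustrative values (ITAE regulation rule at theta_0/tau_0 = 0.5).
K0KC, RATIO = 1.691, 0.5
C = 1.080 * RATIO  # (tau_0/tau_I)(theta_0/tau_0)


def f_n(w, a_th):
    return C * math.cos(a_th * w) + w * math.sin(a_th * w)


def f_d(w, a_th):
    return w * math.cos(a_th * w) - C * math.sin(a_th * w)


def alpha_K(w, a_th):
    return -w / (K0KC * f_d(w, a_th))                   # (7.12a)


def alpha_tau(w, a_th):
    return -RATIO * f_n(w, a_th) / (w * f_d(w, a_th))   # (7.12b)


def bracket_first_zero(func, a_th, step=1e-3):
    """Tight bracket (left, right) around the first positive zero of func."""
    left = 1e-9
    sign0 = func(left, a_th)
    right = left + step
    while func(right, a_th) * sign0 > 0:
        left, right = right, right + step
    for _ in range(60):
        mid = 0.5 * (left + right)
        if func(mid, a_th) * sign0 > 0:
            left = mid
        else:
            right = mid
    return left, right


def boundary_gain(a_T, a_th):
    """f(alpha_tau, alpha_theta) of (7.15): invert (7.12b) on Omega_1, apply (7.12a)."""
    hi = bracket_first_zero(f_n, a_th)[0]   # side of the zero where f_n > 0
    lo = 1e-9 if C * a_th > 1.0 else bracket_first_zero(f_d, a_th)[1]
    for _ in range(80):   # alpha_tau decreases from +inf to 0 over Omega_1
        mid = 0.5 * (lo + hi)
        if alpha_tau(mid, a_th) > a_T:
            lo = mid
        else:
            hi = mid
    return alpha_K(0.5 * (lo + hi), a_th)


def is_stable(a_K, a_T, a_th):
    """Stability test (7.16) for a strictly positive perturbation vector."""
    return 0.0 < a_K < boundary_gain(a_T, a_th)


gm = boundary_gain(1.0, 1.0)
print(gm, is_stable(1.0, 1.0, 1.0), is_stable(2.0, 1.0, 1.0))
```

Evaluating `boundary_gain` on a grid of $(\alpha_\tau, \alpha_\theta)$ values gives the stability-boundary surface, whose level curves can then be plotted.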
7.3.4.  Stability  Margins 

The parametric stability margin is defined as the length (in a vector-norm sense) of the smallest scaled additive perturbation $\Delta q = q - q_0$ that destabilizes the closed-loop system of Figure 7.1. This margin serves as a quantitative measure of the robustness of the closed-loop system with respect to the parametric uncertainty referenced to the nominal point $q_0 = [1, 1, 1]^T$, and is useful as a means of comparing the performance of proposed controller tuning rules. The value of the parametric stability margin depends on the norm used to measure the length of the smallest destabilizing perturbation $\Delta q$. In this paper the $\ell_\infty$ norm is adopted because it represents the smallest box that destabilizes the system and that contains the nominal point in the uncertain parameter space. Other norms may certainly be considered (the $\ell_2$ norm represents a sphere in the uncertain parameter space), but the $\ell_\infty$ norm was chosen because the resulting box gives an easily understood bound in terms of the physical parameters of the system.

The $\ell_\infty$ parametric stability margin is calculated by solving the minimization problem

$$\rho_{K\tau\theta}(\theta_0/\tau_0) := \min_{\omega \in \Omega_1,\ \alpha_\theta > 0} \left\| [\alpha_K - 1,\ \alpha_\tau - 1,\ \alpha_\theta - 1]^T \right\|_\infty \tag{7.17}$$

where $\Omega_1$ is the interval of frequencies that gives the robust stability region as prescribed by Theorem 7.2, and $\alpha_K$ and $\alpha_\tau$ are given by (7.12) in Theorem 7.1. Note that (7.17) explicitly shows the dependence on the ratio $\theta_0/\tau_0$, the only parameter that (7.12) truly depends on when considering a specific tuning rule chosen from Table 7.1. This optimization has no convexity guarantees, but because it involves only two parameters, it can easily be solved by an exhaustive numerical search. In addition, the constraint $\alpha_\theta > 0$ can be further restricted. To show this, for $\alpha_\theta = 1$ (i.e., when there is no perturbation in the time delay) let

$$\rho_{K\tau}(\theta_0/\tau_0) := \min_{\omega \in \Omega_1} \left\| [\alpha_K - 1,\ \alpha_\tau - 1]^T \right\|_\infty$$

be the parametric stability margin when only considering uncertainty in the process gain and in the process time constant. This margin is easily calculated by performing a frequency sweep over the specified range. The minimization (7.17) need only be performed over the range $1 - \rho_{K\tau}(\theta_0/\tau_0) < \alpha_\theta < 1 + \rho_{K\tau}(\theta_0/\tau_0)$, because $\rho_{K\tau}(\theta_0/\tau_0)$ is an upper bound on (7.17), and for every $\alpha_\theta$ outside this range $\left\| [\alpha_K - 1,\ \alpha_\tau - 1,\ \alpha_\theta - 1]^T \right\|_\infty \geq \rho_{K\tau}(\theta_0/\tau_0)$.
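
The exhaustive search described above can be sketched as a nested sweep: an outer grid over $\alpha_\theta$ restricted to $1 \pm \rho_{K\tau}$, and an inner frequency sweep over $\Omega_1$. The code below is an illustrative sketch only. The function names and grid sizes are assumptions, and the constants are assumed values for the ITAE regulation rule at $\theta_0/\tau_0 = 0.5$; the digits obtained depend on the grid resolution and on the rounding of the tuning constants.

```python
import math

# Assumed illustrative values (ITAE regulation rule at theta_0/tau_0 = 0.5).
K0KC, RATIO = 1.691, 0.5
C = 1.080 * RATIO  # (tau_0/tau_I)(theta_0/tau_0)


def f_n(w, a_th):
    return C * math.cos(a_th * w) + w * math.sin(a_th * w)


def f_d(w, a_th):
    return w * math.cos(a_th * w) - C * math.sin(a_th * w)


def first_zero(func, a_th, step=1e-2):
    """First positive zero of func: coarse scan for a sign change, then bisection."""
    left = 1e-9
    sign0 = func(left, a_th)
    right = left + step
    while func(right, a_th) * sign0 > 0:
        left, right = right, right + step
    for _ in range(60):
        mid = 0.5 * (left + right)
        if func(mid, a_th) * sign0 > 0:
            left = mid
        else:
            right = mid
    return 0.5 * (left + right)


def sweep_min(a_th, n_w=1500):
    """Smallest l_inf distance from q0 to the boundary curve at fixed alpha_theta."""
    hi = first_zero(f_n, a_th)
    lo = 0.0 if C * a_th > 1.0 else first_zero(f_d, a_th)
    best = math.inf
    for i in range(1, n_w):                      # interior points of Omega_1
        w = lo + (hi - lo) * i / n_w
        fd = f_d(w, a_th)
        a_K = -w / (K0KC * fd)                   # (7.12a)
        a_T = -RATIO * f_n(w, a_th) / (w * fd)   # (7.12b)
        best = min(best, max(abs(a_K - 1), abs(a_T - 1), abs(a_th - 1)))
    return best


rho_KT = sweep_min(1.0)   # margin with no time-delay perturbation
n_th = 200
rho = min(sweep_min(1 - rho_KT + 2 * rho_KT * k / n_th) for k in range(n_th + 1))
print(rho_KT, rho)
```

Restricting the outer grid to $1 \pm \rho_{K\tau}$ keeps the search small without excluding the minimizer, as argued above; a finer grid tightens the estimate from above.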

The parametric stability margin (7.17) considers simultaneous perturbations in all the uncertain parameters. However, it is also of some interest to use the results of Theorem 7.1 and Theorem 7.2 on the region of stable parameter perturbations to obtain other measures of stability, such as the classical gain and phase margins (Seborg, 1989), as well as a margin defined in terms of perturbations only in the time delay. While the parametric stability margin (7.17) is clearly a superior measure of robustness in its consideration of simultaneous variations in all parameters, a comparison with the results obtained with the gain and phase margins is very instructive.

First consider the parametric gain margin defined as

$$\rho_K(\theta_0/\tau_0) := f(1, 1; \theta_0/\tau_0)$$

where the dependence on the ratio $\theta_0/\tau_0$ is again shown explicitly. Since the parametric gain margin is the smallest multiplicative perturbation in the gain that destabilizes the system when there are no perturbations in the process time constant and in the process time delay, this definition is equivalent to the standard classical gain margin GM (Ogata, 1990), namely

$$GM(\theta_0/\tau_0) = \rho_K(\theta_0/\tau_0) \tag{7.18}$$

We can also obtain an expression for the classical phase margin by identifying the gain-crossover frequency. This is done by first defining the parametric time delay margin as

$$\rho_\theta(\theta_0/\tau_0) := \min\ \alpha_\theta \quad \text{s.t.} \quad f(1, \alpha_\theta; \theta_0/\tau_0) = 1 \tag{7.19}$$

which is the smallest multiplicative perturbation in the time delay that destabilizes the system when there are no perturbations in the process gain and in the process time constant. It then follows that there exists a frequency for which the Nyquist plot of the open-loop response of the system with the perturbation $q = [1, 1, \rho_\theta(\theta_0/\tau_0)]^T$ passes through the critical point $-1 + j0$, because the closed-loop characteristic quasipolynomial (7.7) vanishes at that frequency. At this frequency the magnitude of the open-loop response of the time-delay perturbed system is $|-1 + j0| = 1$. The magnitude of the open-loop response of the nominal system is also 1 at this frequency, because the open-loop magnitude response is independent of perturbations in the time delay. Therefore, this frequency is the gain-crossover frequency. Letting $\omega_g(\theta_0/\tau_0)$ denote the dimensionless frequency solution to (7.12) when $q = [1, 1, \rho_\theta(\theta_0/\tau_0)]^T$, the standard gain-crossover frequency is given by $\tilde\omega_g = \omega_g(\theta_0/\tau_0)/\theta_0$. Note that the dimensionless gain-crossover frequency $\omega_g(\theta_0/\tau_0)$ depends only on the tuning ratio $\theta_0/\tau_0$ and the tuning rule, whereas the standard gain-crossover frequency depends on the tuning ratio $\theta_0/\tau_0$, on $\theta_0$, and on the tuning rule. The standard phase margin is determined as follows. First, the only difference between the open-loop response of the time-delay perturbed system and the open-loop response of the nominal system occurs in the exponential term. At the gain-crossover frequency the open-loop response of the time-delay perturbed system is $-1 + j0$. By definition (Ogata, 1990), the standard phase margin PM is the phase shift that will rotate the open-loop response of the nominal system to $-1 + j0$. Therefore, at the gain-crossover frequency the exponential term of the time-delay perturbed system must equal the exponential term of the nominal system phase shifted by PM. That is

$$e^{-j\alpha_\theta\theta_0\tilde\omega_g} = e^{-j\theta_0\tilde\omega_g}\,e^{-jPM}$$

or

$$-\alpha_\theta\theta_0\tilde\omega_g = -\theta_0\tilde\omega_g - PM$$

with $\alpha_\theta = \rho_\theta(\theta_0/\tau_0)$. This gives rise to the following expression for the standard phase margin

$$PM(\theta_0/\tau_0) = \rho_\theta(\theta_0/\tau_0)\,\omega_g(\theta_0/\tau_0) - \omega_g(\theta_0/\tau_0) \tag{7.20}$$

which is in terms of the dimensionless gain-crossover frequency $\omega_g(\theta_0/\tau_0)$ to show its sole dependency on the tuning ratio $\theta_0/\tau_0$ and, of course, the tuning rule.

Because the classical gain and phase margins consider perturbations in only one of the parameters at a time, they may not capture the true robust stability characteristics of the system. Consider for example a system that has large gain and phase margins, but becomes unstable when there are relatively small (compared to the gain and phase margins) perturbations in both the process gain and time delay. On the other hand, a system with a poor gain or phase margin can safely be anticipated to have poor robust stability characteristics. Therefore, the gain and phase margins are useful in helping the designer to avoid controllers with poor robust stability characteristics, but they may not be sufficient to ensure stability robustness. The next section presents results that compare the robustness properties of various classical PI controller tuning rules.

7.4.  Results  of  Numerical  Studies 
7.4.1.  Region  of  Stable  Perturbations  for  the  ITAE  Regulation  Tuning  Rule 

Using the theoretical development of the previous section it is possible to determine the region of stable perturbations of the process parameters for the ITAE regulation tuning rule. The other tuning rules have qualitatively similar results. From Table 7.1, the ITAE regulation tuning rule gives

$$K_0K_c = 0.859\,(\theta_0/\tau_0)^{-0.977}$$

and

$$\frac{\tau_0}{\tau_I} = 0.674\,(\theta_0/\tau_0)^{-0.680}$$

For the purposes of illustrating the theory proposed, consider nominal process parameters $K_0 = 1$, $\tau_0 = 1$, and $\theta_0 = 0.5$. The tuning-parameter ratio is $\theta_0/\tau_0 = 0.5$, giving $K_0K_c = 1.691$ and $\tau_0/\tau_I = 1.080$. First consider the case where there is no multiplicative perturbation of the time delay, such that $\alpha_\theta = 1$. Standard calculations show that the gain margin for the loop is GM = 1.816 and the phase margin for the loop is PM = 38.61 degrees. From the theoretical developments, the robust stability boundary is determined from the first range of frequencies for which $\alpha_K$ and $\alpha_\tau$ are simultaneously positive. Because $(\tau_0/\tau_I)(\theta_0/\tau_0)\alpha_\theta = 0.540 < 1$, Theorem 7.2 states that this range is given by $(\omega_{d1}, \omega_{n1})$, where $\omega_{d1}$ is the first positive zero of the function $f_d(\omega)$ and $\omega_{n1}$ is the first positive zero of the function $f_n(\omega)$ given in Lemma 7.3. From (F.2), $\omega_{d1}$ is in the range $(0, \tfrac{1}{2}\pi)$ and $\omega_{n1}$ is in the range $(\tfrac{1}{2}\pi, \pi)$. A simple numerical search in these ranges gives $\omega_{d1} = 1.122$ and $\omega_{n1} = 2.961$. Now equations (7.12a) and (7.12b) can be evaluated over the frequency interval $\omega \in (1.122, 2.961)$ while keeping $\theta_0/\tau_0 = 0.5$ and $\alpha_\theta = 1$. From Theorem 7.2, the resulting perturbation pairs are all the boundary-point pairs $(\alpha_K, \alpha_\tau)$ with $\alpha_\theta = 1$ and are plotted in Figure 7.2.
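
The tuning values quoted above follow directly from the two ITAE regulation correlations. As a small worked check, with the function name being a hypothetical label and the coefficients taken from the correlations as quoted in this section:

```python
def itae_regulation(ratio):
    """ITAE regulation correlations as quoted from Table 7.1 (ratio = theta_0/tau_0)."""
    k0kc = 0.859 * ratio ** -0.977            # product K_0 K_c
    tau0_over_taui = 0.674 * ratio ** -0.680  # ratio tau_0 / tau_I
    return k0kc, tau0_over_taui


ratio = 0.5
k0kc, tau0_over_taui = itae_regulation(ratio)
# Quantity compared with 1 in Theorem 7.2, for alpha_theta = 1.
case_value = tau0_over_taui * ratio * 1.0
print(round(k0kc, 3), round(tau0_over_taui, 3), round(case_value, 3))
```

Since the last quantity is below one, the second case of the definition of $\Omega_1$ applies, which is why the boundary is traced over $(\omega_{d1}, \omega_{n1})$ rather than $(0, \omega_{n1})$.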

The curve in Figure 7.2 represents the robust stability boundary. All points below the curve define the region of stable perturbations. The '+' in Figure 7.2 represents the nominal point $\alpha_K = 1$, $\alpha_\tau = 1$, and $\alpha_\theta = 1$. Obviously, the nominal point is part of the region of stable perturbations. The figure shows that from the nominal point it is possible to decrease towards zero the multiplicative perturbation in the process gain $\alpha_K$, or arbitrarily increase the multiplicative perturbation in the process time constant $\alpha_\tau$, while the system remains closed-loop stable. On the other hand, if $\alpha_K$ is sufficiently increased, or $\alpha_\tau$ sufficiently decreased, the closed-loop system becomes unstable. These points of critical instability are given by the intersection of the stability boundary with a vertical or horizontal line passing through the nominal point, respectively, with the former being the standard gain margin (7.18), which from Figure 7.2 has a value of GM = 1.816 as expected. For any positive values of the multiplicative perturbations $\alpha_K$ and $\alpha_\tau$, Figure 7.2 shows whether the ITAE regulation tuning rule with $\theta_0/\tau_0 = 0.5$ and $\alpha_\theta = 1$ is stable; the general trend is that an increase in $\alpha_\tau$ results in an increase in the range of stable values of $\alpha_K$.

Figure 7.2. The robust stability boundary for the ITAE regulation tuning rule when $\theta_0/\tau_0 = 0.5$ and $\alpha_\theta = 1$. The '+' marker represents the nominal point $\alpha_K = 1$, $\alpha_\tau = 1$, and $\alpha_\theta = 1$. The '*' marker denotes the gain margin $\rho_K(0.5) = GM(0.5) = 1.816$.

To include perturbations in the time delay, the stability boundary is now obtained for a range of values of $\alpha_\theta$ (with the tuning ratio parameter held constant at $\theta_0/\tau_0 = 0.5$). Evaluating (7.12a) and (7.12b) over the appropriate frequency intervals as determined in Theorem 7.2 for values of $\alpha_\theta$ = 0.5, 1.0, 1.5, 1.781, and 2.5 yields the parametric plot of Figure 7.3.

Figure 7.3. The robust stability boundaries for the ITAE regulation tuning rule for $\theta_0/\tau_0 = 0.5$ and selected values of $\alpha_\theta$.

The figure shows that as $\alpha_\theta$ increases, the region of stable perturbations becomes smaller (i.e., for a given value of the perturbation in the process time constant $\alpha_\tau$, the range of stable perturbations in process gain $\alpha_K$ decreases). The reverse is also true for decreasing $\alpha_\theta$ values. The inclusion of perturbations in the time delay is better illustrated in Figure 7.4, where the results obtained for varying $\alpha_\theta$ are plotted on a contour plot where the x-axis is $\alpha_\tau$, the y-axis is $\alpha_\theta$, and the level curves are constant values of the stability-boundary surface (7.15). For example, consider a multiplicative perturbation in the process time constant of $\alpha_\tau = 1.5$ and a multiplicative perturbation in the process time delay of $\alpha_\theta = 1.5$. The contour plot shows that $f(1.5, 1.5) = 1.6$ at this point, implying that the system remains stable as long as the multiplicative perturbation in the process gain $\alpha_K$ is less than 1.6, as required by (7.16). The gain margin (7.18) is given by the contour curve passing through the '+' marker that represents the point $\alpha_\tau = 1$ and $\alpha_\theta = 1$, again giving $\rho_K(0.5) = GM(0.5) = 1.816$.

Figure 7.4. The contour plot of the stability-boundary surface for the ITAE regulation tuning rule when $\theta_0/\tau_0 = 0.5$. The '*' marker shows the parametric time delay margin $\rho_\theta(0.5) = 1.781$.

Figure 7.4 also shows that the $f(\alpha_\tau, \alpha_\theta) = 1$ contour intersects the vertical line $\alpha_\tau = 1$ at $\alpha_\theta = 1.781$, consistent with Figure 7.3. Therefore, the parametric time-delay margin is $\rho_\theta(0.5) = 1.781$, because by the definition (7.19) this is the smallest value of $\alpha_\theta$ at which $f(1, \alpha_\theta) = 1$. This means that when only considering perturbations in the process time delay, a multiplicative perturbation greater than or equal to 1.781 destabilizes the closed-loop system. In addition, the frequency solution to (7.12) at which $\alpha_K = 1$, $\alpha_\tau = 1$, and $\alpha_\theta = \rho_\theta(0.5) = 1.781$ is the gain-crossover frequency, which for this example is $\omega_g(0.5) = 0.8629$. Using equation (7.20) the standard phase margin is determined to be PM = 38.61 degrees, which agrees with the previous calculation.
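
The link between the classical phase margin and the time delay margin in (7.20) can be cross-checked with a standard frequency-response calculation. The sketch below assumes the nominal open loop is a PI controller in series with a first-order-plus-time-delay process, $L(s) = K_0K_c\,(\tau_I s + 1)\,e^{-\theta_0 s} / (\tau_I s\,(\tau_0 s + 1))$, evaluated at the dimensionless frequency; the function names and the numerical constants (ITAE regulation rule at $\theta_0/\tau_0 = 0.5$) are assumptions made for illustration.

```python
import math

# Assumed illustrative values (ITAE regulation rule at theta_0/tau_0 = 0.5).
K0KC, RATIO = 1.691, 0.5
C = 1.080 * RATIO  # theta_0 / tau_I


def loop_gain(w):
    """|L(j w / theta_0)| of the assumed nominal PI/FOPTD loop."""
    return K0KC * math.sqrt(C ** 2 + w ** 2) / (w * math.sqrt(1 + (w / RATIO) ** 2))


def loop_phase(w):
    """Phase of L(j w / theta_0) in radians, including the nominal delay term."""
    return math.atan(w / C) - math.pi / 2 - math.atan(w / RATIO) - w


# Bisection for the gain crossover |L| = 1 (the magnitude decreases with w).
lo, hi = 1e-3, 10.0
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if loop_gain(mid) > 1.0:
        lo = mid
    else:
        hi = mid
w_g = 0.5 * (lo + hi)

pm = math.pi + loop_phase(w_g)   # classical phase margin in radians
rho_theta = 1.0 + pm / w_g       # time delay margin from rearranging (7.20)
pm_deg = math.degrees(pm)
print(w_g, pm_deg, rho_theta)
```

The three printed values can be compared with the gain-crossover frequency, phase margin, and time delay margin reported in this example.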

Finally, using (7.17) the parametric stability margin is $\rho_{K\tau\theta} = 0.2395$. This represents the smallest additive perturbation in the nominal parameters that will destabilize the system. In fact, the perturbed system given by $K = (1 + \rho_{K\tau\theta})K_0 = 1.2395$, $\tau = (1 - \rho_{K\tau\theta})\tau_0 = 0.7605$, and $\theta = (1 + \rho_{K\tau\theta})\theta_0 = 0.6198$ is critically stable. This is represented in Figure 7.4 as the intersection of $\alpha_\tau = 0.7605$, $\alpha_\theta = 1.2395$, and $f(\alpha_\tau, \alpha_\theta) = 1.2395$. The utility of the parametric stability margin is now apparent. The gain margin, GM = 1.816, and the phase margin, PM = 38.61 degrees, are both within the recommended range for a well-tuned controller (i.e., 1.7 < GM < 2.0 and 30° < PM < 45° (Seborg, 1989)). On the other hand, the parametric stability margin, $\rho_{K\tau\theta} = 0.2395$, reveals that in fact only a relatively small additive perturbation of the three process parameters results in an unstable closed loop. Therefore, this example demonstrates the observation suggested earlier about the gain and phase margins, namely that a good gain margin and a good phase margin are necessary for robustness, but are not sufficient. To ensure robust stability, even for this relatively simple system of a PI controller acting on a first-order-plus-time-delay system, it is necessary to use a margin that considers uncertainties in all the process parameters, such as the parametric stability margin.
7.4.2.  Stability  Margins  Computation  for  Each  Tuning  Rule 

The results of the previous section are specialized to only one tuning rule and to a specific value of the tuning-parameter ratio, namely the ITAE regulation tuning rule and $\theta_0/\tau_0 = 0.5$. These results can be repeated for other tuning rules or other values of $\theta_0/\tau_0$, giving qualitatively similar results. It is also possible to easily compare quantitatively the various tuning rules over a range of tuning parameters. The following figures show the values of the gain margin, phase margin, and parametric stability margin for all the tuning rules being considered, over the accepted range of the tuning-parameter ratio, namely $0.1 < \theta_0/\tau_0 < 1.0$.

Controller manufacturers recommend that a well-tuned controller have a gain margin between 1.7 and 2.0 (Seborg, 1989). Figure 7.5 shows that only the ITAE regulation (ITAE-reg.) tuning rule satisfies this recommendation over the recommended range of the tuning parameter $\theta_0/\tau_0$.


Figure 7.5. Gain margin of the tuning rules vs. the tuning parameter.

The Ziegler-Nichols (ZN), Cohen-Coon (CC), and IAE regulation (IAE-reg.) tuning rules
are the next closest to this recommendation, having gain margins in the range 1.45 to 2.25.
The figure also shows that the ISE regulation (ISE-reg.) tuning rule has poor stability
characteristics, with a gain margin always less than 1.5. The ITAE servo (ITAE-servo)
and IAE servo (IAE-servo) tuning rules can be considered highly conservative, with gain
margins always greater than 2.25. For the ZN, CC, ITAE-reg., IAE-reg., and ISE-reg.
tuning rules the general trend is that the gain margin increases with increasing values of
the ratio $\theta_0/\tau_0$. This implies that the tuning rules yield inherently more conservative
controllers as the time-delay-to-time-constant ratio increases. This trend is appropriate
and reasonable considering the difficulties usually associated with processes having large
time delays. For the ITAE-servo and IAE-servo rules the gain margin decreases as the
time-delay-to-time-constant ratio increases, but this is not considered a problem given that
these tuning rules are already considered to be overly conservative.

Figure 7.6. Phase margin of the tuning rules vs. the tuning parameter.

Controller manufacturers recommend that a well-tuned controller have a phase
margin between 30 and 45 degrees (Seborg, 1989). Figure 7.6 shows that none of the
tuning rules satisfies this recommendation over the range of $\theta_0/\tau_0$ considered. The CC,
ITAE-reg., and IAE-reg. rules can be considered the best tuned because the phase margin stays
within the range 20 to 65 degrees. For values of $\theta_0/\tau_0$ below 0.5, the ISE-reg. tuning
rule lacks robustness because the phase margin is less than 20 degrees. Again, for the ZN,
CC, ITAE-reg., IAE-reg., and ISE-reg. tuning rules the general trend is that the phase
margin increases with increasing $\theta_0/\tau_0$, with the ZN tuning rule showing a dramatic
increase in phase margin. This again implies that the tuning rules result in inherently
more conservative controllers as the time-delay-to-time-constant ratio increases. For the
ITAE-servo and IAE-servo rules the phase margin remains relatively constant as the
time-delay-to-time-constant ratio increases, staying in the range 55 to 65 degrees. The
parametric time delay margin (7.19) is shown in Figure 7.7.

Figure 7.7. Time delay margin of the tuning rules vs. the tuning parameter.

A comparison of Figures 7.6 and 7.7 reveals the expected connection between the phase
margin and the parametric time delay margin, based on the similarity in the trends of the
results.
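The classical margins discussed above can be computed numerically for a single tuning rule. The following is a minimal sketch, not the procedure used to generate Figures 7.5 to 7.7. It assumes a PI controller $K_c(1 + 1/(\tau_I s))$ in series with a first-order-plus-time-delay process $K e^{-\theta s}/(\tau s + 1)$, and it assumes the commonly quoted open-loop Ziegler-Nichols PI settings $K_c = 0.9\tau/(K\theta)$ and $\tau_I = 3.33\theta$. All function names are hypothetical. The delay margin computed here is the classical one (phase margin divided by the gain crossover frequency), which may differ in normalization from the parametric time delay margin of (7.19).

```python
import math

def loop_response(w, K, tau, theta, Kc, tauI):
    """Magnitude and unwrapped phase (rad) of the loop transfer function at s = jw."""
    mag = Kc * K * math.hypot(1.0, w * tauI) / (w * tauI * math.hypot(1.0, w * tau))
    phase = -math.pi / 2 + math.atan(w * tauI) - math.atan(w * tau) - theta * w
    return mag, phase

def first_crossing(f, lo, hi, steps=20000):
    """First sign change of f on a logarithmic grid over [lo, hi], refined by bisection."""
    ratio = (hi / lo) ** (1.0 / steps)
    a, fa = lo, f(lo)
    for _ in range(steps):
        b = a * ratio
        fb = f(b)
        if fa * fb < 0:
            for _ in range(80):
                m = 0.5 * (a + b)
                fm = f(m)
                if fa * fm <= 0:
                    b = m
                else:
                    a, fa = m, fm
            return 0.5 * (a + b)
        a, fa = b, fb
    raise ValueError("no crossing found in the scanned range")

def margins(K, tau, theta, Kc, tauI):
    """Gain margin, phase margin (degrees), and classical delay margin (time units)."""
    lo, hi = 1e-4 / theta, 1e3 / theta
    resp = lambda w: loop_response(w, K, tau, theta, Kc, tauI)
    w_pc = first_crossing(lambda w: resp(w)[1] + math.pi, lo, hi)  # phase = -180 deg
    w_gc = first_crossing(lambda w: resp(w)[0] - 1.0, lo, hi)      # magnitude = 1
    gain_margin = 1.0 / resp(w_pc)[0]
    phase_margin = math.pi + resp(w_gc)[1]
    delay_margin = phase_margin / w_gc  # extra delay that removes the phase margin
    return gain_margin, math.degrees(phase_margin), delay_margin

# Illustrative nominal model with theta0/tau0 = 0.5 (values chosen for the example only).
K0, tau0, theta0 = 1.0, 1.0, 0.5
Kc, tauI = 0.9 * tau0 / (K0 * theta0), 3.33 * theta0  # assumed ZN PI settings
gm, pm_deg, dm = margins(K0, tau0, theta0, Kc, tauI)
print(f"gain margin {gm:.2f}, phase margin {pm_deg:.1f} deg, delay margin {dm:.3f}")
```

Because the delay margin is the phase margin scaled by the gain crossover frequency, the two quantities tend to move together, which is the connection the comparison of Figures 7.6 and 7.7 points to.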

Finally,  the  parametric  stability  margin  is  shown  in  Figure  7.8.  This  margin  is  the 
most  representative  measure  of  robust  stability  because  it  considers  simultaneous 
uncertainties  in  all  the  parameters. 


Figure  7.8.   Parametric  stability  margin  of  the  tuning  rules  vs.  the  tuning 
parameter. 

For higher values of $\theta_0/\tau_0$, the figure shows that the order of increasing robustness for
the tuning rules is ISE-reg., CC, IAE-reg., ITAE-reg., ZN, IAE-servo, and ITAE-servo.
For lower values of $\theta_0/\tau_0$, the CC and IAE-reg. tuning rules change their order of
robustness. The ZN and ITAE-reg. tuning rules also change their order of robustness for
lower values of $\theta_0/\tau_0$. The figure is interpreted as follows. For $\theta_0/\tau_0 = 0.1$ the
ISE-reg. parametric stability margin is $\rho_{K\tau\theta} = 0.07$. This implies that a 7% additive
perturbation of the nominal process parameters causes an unstable closed-loop system. In
fact, for any values of $K_0$, $\tau_0$, and a value $\theta_0$ satisfying $\theta_0/\tau_0 = 0.1$, a controller
designed using the ISE regulation tuning rules is destabilized by the process values
$K = 1.07K_0$, $\tau = 0.93\tau_0$, and $\theta = 1.07\theta_0$. The ISE-reg. tuning rule is not very robust,
because the nominal values of $K_0$, $\tau_0$, and $\theta_0$ are only estimates for which the
approximation error could easily be greater than 0.07. As for the robustness of the other
tuning rules, the control engineer could take advantage of available estimates of modeling
error bounds and use Figure 7.8 as a guide in choosing a sufficiently robust controller
tuning rule.

7.5.  Conclusions 
The  robustness  of  a  very  popular  class  of  PI  controller  tuning  rules  used  in 
conjunction  with  first-order  plus  time-delay  models  of  industrial  processes  is  of 
substantial  ongoing  interest.  This  paper  provides  a  mechanism  for  rigorously  evaluating 
the  robustness  of  various  tuning  rules  with  respect  to  variations  in  the  gain,  delay,  and 
time-constant  of  the  system  model.  An  application  of  the  zero-exclusion  principle  leads  to 
a  characterization  of  the  region  of  stable  perturbations,  which  is  then  used  to  generate 
analytical  and  graphical  tests  for  robust  stability.  The  parametric  stability  margin  and 
classical  gain  and  phase  margins  were  also  computed  based  on  the  region  of  stable 
perturbations  and  used  to  study  the  comparative  robustness  merits  of  various  PI  tuning 
rules.  This  rigorous  analysis  serves  to  confirm  some  commonly  held  views  on  the  merits 
of  particular  tuning  rules,  but  also  highlights  the  pronounced  lack  of  stability  robustness 
in  rules  such  as  the  ISE  regulation  tuning  rule. 


CHAPTER  8 
CONCLUSIONS  AND  FUTURE  WORK 

8.1.  Conclusions 

This dissertation presents an in-depth exploration of robust stability analysis
methods for systems with structured and parametric uncertainties.

Chapters 2 through 4 investigate the Major Principal Direction Alignment
(MPDA) property. Chapter 2 gives a revised statement of the MPDA that fully considers
the case of repeated maximum singular values. In addition, a new proof is presented that
makes use of dual norm and dual vector theory. Chapter 3 studies the optimization
problem that gives an upper bound on the structured singular value, $\mu$. In particular, when
the maximum singular value is repeated the objective function is non-differentiable (i.e.,
the gradient does not exist). This work presents a characterization of the
subdifferential (i.e., the set of all sub-gradients or generalized gradients) that can be
used to obtain the steepest descent direction and necessary and sufficient conditions for a
minimum. Specifically, for the case of a twice repeated maximum singular value the
subdifferential is shown to be a 3-dimensional ellipsoid in an $n$-dimensional space.
Finally, attainability of MPDA (which eliminates conservatism in the upper bound of $\mu$)
is shown to be equivalent to the optimal point lying on the surface of the ellipsoid.
Chapter 4 gives a necessary condition for an optimum that requires the optimal point to
be an element of the null space of a matrix formed from the elements of the left and right
eigenvectors of the system matrix.

Chapter 5 gives an extension of the critical direction theory to systems with
non-convex value sets through the introduction of a general definition of the Nyquist robust
stability margin that preserves the earlier results. Chapter 6 gives a parameter space
method for determining a Proportional-Integral (PI) controller that has guaranteed stability
properties based on the Nyquist robust stability margin while optimizing integral time
error performance objectives.

Lastly, Chapter 7 gives an extensive stability analysis of classical PI controller
tuning rules based on approximate first-order-plus-time-delay models. The results give the
region of all stable perturbations in the model parameters as well as plots of the gain
margin, phase margin, and parametric stability margin as functions of the tuning parameter.

8.2.  Future  Work 

One obvious area of future work is applying the stability analysis results of
Chapter 7 to Proportional-Integral-Derivative (PID) controller tuning rules. Another
research interest of the author is the stability analysis of predictive controllers, and work is
currently proceeding on designing robust predictive controllers by determining the
parametric stability margin as a function of the tuning parameters of the predictive
controller, namely the prediction horizon, control horizon, and weighting on the input.


APPENDIX  A 
PROOF  OF  LEMMA  2.1 

Proof. The following proof is a modification of the proof given by Bauer (1962) in
that it is specialized to the case of the Euclidean norm, and is therefore in terms of the
maximum singular value of the matrix. Excluding the trivial case $\lambda_i(A) = 0$, assume
without loss of generality that

$$\bar{\sigma}(A) = \lambda_i(A) > 0$$

Let $v_i$ be any normalized right eigenvector of $A$ with eigenvalue $\lambda_i(A)$, and choose $w_i$
such that

$$w_i \parallel_D v_i$$

which for the Euclidean norm implies uniquely that $w_i = v_i/\|v_i\| = v_i$. From the
definition of dual vectors

$$w_i^* v_i = 1$$

Multiplying by $\lambda_i(A)$ gives

$$\lambda_i(A)\, w_i^* v_i = \lambda_i(A)$$

Using the definition of eigenvector gives

$$w_i^* A v_i = \lambda_i(A) \tag{A.1}$$

And from the assumption

$$w_i^* A v_i = \bar{\sigma}(A)$$

The Hölder inequality implies

$$w_i^* A v_i \le \|A^* w_i\|\,\|v_i\| = \|A^* w_i\|$$

Combining the previous two equations gives

$$\bar{\sigma}(A) \le \|A^* w_i\| \tag{A.2}$$

The maximum singular values of $A$ and $A^*$ coincide, therefore by definition

$$\bar{\sigma}(A) = \bar{\sigma}(A^*) = \max_{\|z\|=1} \|A^* z\| \tag{A.3}$$

Inequality (A.2) implies that $w_i$ maximizes definition (A.3) such that

$$\bar{\sigma}(A) = \|A^* w_i\|$$

which from the assumption gives

$$\lambda_i(A) = \|A^* w_i\|$$

or equivalently

$$\left\|\frac{A^* w_i}{\lambda_i(A)}\right\| = 1 \tag{A.4}$$

Combining (A.1) and (A.4) with $\|v_i\| = 1$ gives

$$\frac{w_i^* A}{\lambda_i(A)}\, v_i = \left\|\frac{A^* w_i}{\lambda_i(A)}\right\| \|v_i\| = 1$$

Meaning

$$\frac{A^* w_i}{\lambda_i(A)} \parallel_D v_i$$

But the dual of $v_i$ is uniquely determined to be $w_i$, therefore

$$\frac{A^* w_i}{\lambda_i(A)} = w_i$$

or

$$w_i^* A = \lambda_i(A)\, w_i^*$$

such that $w_i$ is a left eigenvector of $A$ and is dual to $v_i$.

Q.E.D.


APPENDIX  B 
PROOF  OF  THEOREM  2.1 

Proof. The following proof is taken verbatim from (Kouvaritakis, 1985) and
consists of two cases, namely, when the maximum singular value is distinct, and when it
is repeated.

(i) Distinct Maximum Singular Values:

The principal directions of $A$ are unique (with respect to each other) to within a
scaling factor $e^{j\theta}$, so that alignment of the major input and major output principal
directions of $A$ implies

$$y_A = e^{j\theta} x_A \tag{B.1}$$

Pre-multiplication of equation (B.1) by $A$ gives

$$A y_A = \bar{\sigma}(A)\, x_A = e^{j\theta} A x_A$$

or

$$A x_A = e^{-j\theta}\, \bar{\sigma}(A)\, x_A$$

so that $e^{-j\theta}\bar{\sigma}$ emerges as an eigenvalue $\lambda$ of $A$. Noting that the moduli of the
eigenvalues of $A$ are always bounded from above by $\bar{\sigma}(A)$, it follows that
$|\lambda| = \rho(A) = \bar{\sigma}(A)$.

To prove the converse, assume that $\rho(A) = \bar{\sigma}(A)$ so that there exists a vector $w$
such that

$$A w = e^{j\psi}\, \bar{\sigma}(A)\, w \tag{B.2}$$

and

$$w^* A^* = e^{-j\psi}\, \bar{\sigma}(A)\, w^* \tag{B.3}$$

Multiplying equation (B.3) on the right by equation (B.2), and introducing the singular
value decomposition of $A$, we derive

$$\frac{w^* A^* A w}{w^* w} = \frac{w^* Y(A)\, \Sigma^2(A)\, Y^*(A)\, w}{w^* Y(A)\, Y^*(A)\, w}$$

which for $z = Y^*(A) w$ implies

$$\frac{z^* \Sigma^2(A)\, z}{z^* z} = \bar{\sigma}^2(A)$$

If we assume, without loss of generality, that $w$ is normalized then this last equality can
only be attained for $z = e^{j\theta} e_1$, where $e_1$ is the first standard basis vector, with a '1' in the
first position and 0's everywhere else. Thus $w = Y(A) z = Y(A) e^{j\theta} e_1 = e^{j\theta} y_A$, which is
next substituted in equation (B.2) to give

$$A e^{j\theta} y_A = \bar{\sigma}(A)\, e^{j\theta} x_A = e^{j\psi}\, \bar{\sigma}(A)\, e^{j\theta} y_A$$

or

$$x_A = e^{j\psi} y_A$$

and this completes the proof.

(ii) Repeated Maximum Singular Value:

A simple modification of the above arguments, applied to the subspace spanned
by the principal directions associated with the repeated singular value, caters for the
general case. Let the maximum singular value be repeated with multiplicity $q$, and let
$x_i$, $y_i$, $1 \le i \le q$, denote the output and input principal directions associated with the $q$
repeated singular values $\sigma = \bar{\sigma}(A)$. Then the major principal directions of $A$ will no
longer be unique, but will be given by

$$y = \sum_{i=1}^{q} \alpha_i y_i, \qquad x = \sum_{i=1}^{q} \alpha_i x_i$$

where

$$\sum_{i=1}^{q} |\alpha_i|^2 = 1$$

The proof of sufficiency remains exactly the same. To prove necessity,
proceeding as before we obtain from equations (B.2) and (B.3)

$$\frac{z^* \Sigma^2(A)\, z}{z^* z} = \bar{\sigma}^2(A) \tag{B.4}$$

for

$$z = Y^*(A)\, w \tag{B.5}$$

Since the first $q$ elements of the diagonal matrix $\Sigma(A)$ are all equal to $\bar{\sigma}(A)$, it follows
from equation (B.4) that for a normalized $w$, $z$ must now assume the form

$$z = [\alpha_1,\ \alpha_2,\ \cdots,\ \alpha_q,\ 0,\ \cdots,\ 0]^T$$

with

$$\sum_{i=1}^{q} |\alpha_i|^2 = 1$$

Back substitution of $w = Y(A) z$ into equation (B.2) reveals that

$$\sum_{i=1}^{q} \alpha_i x_i = e^{j\psi} \sum_{i=1}^{q} \alpha_i y_i$$

and this completes the proof. Q.E.D.
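The distinct-singular-value case of the theorem can be illustrated numerically. The sketch below is an illustration only, restricted to real 2x2 matrices with a distinct maximum singular value so that everything has a closed form; the function names are hypothetical. It computes the spectral radius, the maximum singular value, and the alignment $|x_A^T y_A|$ between the major output and input principal directions. For such matrices, $\rho(A) = \bar{\sigma}(A)$ should coincide with an alignment of 1.

```python
import cmath
import math

def spectral_radius(A):
    """Largest eigenvalue modulus of a real 2x2 matrix, from the characteristic quadratic."""
    (a, b), (c, d) = A
    half_tr = 0.5 * (a + d)
    root = cmath.sqrt(half_tr * half_tr - (a * d - b * c))
    return max(abs(half_tr + root), abs(half_tr - root))

def major_directions(A):
    """Return (sigma_max, y, x): maximum singular value, major input and output directions.

    Assumes A is real, nonzero, and has a distinct maximum singular value.
    """
    (a, b), (c, d) = A
    # M = A^T A is symmetric; its largest eigenvalue is sigma_max squared.
    m11, m12, m22 = a * a + c * c, a * b + c * d, b * b + d * d
    lam = 0.5 * (m11 + m22) + math.sqrt((0.5 * (m11 - m22)) ** 2 + m12 ** 2)
    sigma = math.sqrt(lam)
    if abs(m12) > 1e-12:
        y = (m12, lam - m11)          # eigenvector of M for eigenvalue lam
    elif m11 >= m22:
        y = (1.0, 0.0)
    else:
        y = (0.0, 1.0)
    n = math.hypot(y[0], y[1])
    y = (y[0] / n, y[1] / n)
    x = ((a * y[0] + b * y[1]) / sigma, (c * y[0] + d * y[1]) / sigma)  # A y = sigma x
    return sigma, y, x

def alignment(A):
    """|x^T y|, equal to 1 exactly when the major directions are aligned."""
    _, y, x = major_directions(A)
    return abs(x[0] * y[0] + x[1] * y[1])

for A in ([[2.0, 1.0], [1.0, 2.0]], [[1.0, 1.0], [0.0, 1.0]], [[0.0, 2.0], [0.0, 0.0]]):
    sigma, _, _ = major_directions(A)
    print(A, "rho =", spectral_radius(A), "sigma_max =", sigma, "alignment =", alignment(A))
```

The symmetric matrix has aligned directions and equal spectral radius and maximum singular value, while the two non-normal matrices have a strictly smaller spectral radius and an alignment below 1.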


APPENDIX  C 
PROOF  OF  THEOREM  7.1 

Proof. By definition the closed-loop system is robustly stable with respect to
$Q_{\max}$, and for any $q = [\alpha_K\ \ \alpha_\tau\ \ \alpha_\theta]^T \in \partial Q_{\max}$, the set $Q_{\max} \cup q$ is not robustly stable. From
Lemma 7.2 this implies that $q$ must satisfy the equality

$$A(\omega;\ \theta_0/\tau_0;\ \alpha_\theta) \begin{bmatrix} \alpha_K \\ \alpha_\tau \end{bmatrix} = b(\omega) \tag{C.1}$$

for some $\omega > 0$. The solutions $[\alpha_K\ \ \alpha_\tau]^T$ of (C.1) at each frequency depend on the
rank of the matrix $A(\omega;\ \theta_0/\tau_0;\ \alpha_\theta)$, and only solutions that satisfy $\alpha_K > 0$, $\alpha_\tau > 0$, and
$\alpha_\theta > 0$ are considered as admissible because $Q_{\max}$ is the space of strictly-positive real
ordered triplets. First, at those frequencies where matrix $A$ is full rank, the solutions for
$\alpha_K$ and $\alpha_\tau$ are given by (7.12a)-(7.12b), as claimed in the theorem. It now suffices to
exclude from the admissible set all the solutions corresponding to frequencies where $A$
is rank deficient.

Matrix $A$ is not of full rank when $\det(A) = 0$, or equivalently when

$$K_0 K_c \left[ \omega \cos(\alpha_\theta \omega) - \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0} \sin(\alpha_\theta \omega) \right] \left( \frac{\theta_0}{\tau_0} \right) \omega = 0$$

Since the values of $K_0 K_c$ and $\theta_0/\tau_0$ are nonzero, $A$ is not full rank when either $\omega = 0$
or at frequencies where

$$\omega \cos(\alpha_\theta \omega) - \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0} \sin(\alpha_\theta \omega) = 0 \tag{C.2}$$

As expected, when $\omega = 0$ the equation associated with the imaginary part of the image of
the uncertainty drops out, because the image of the uncertainty is a real number. The
solution for the case $\omega = 0$ is therefore obtained from the first equation of (C.1), namely,

$$\begin{bmatrix} K_0 K_c \dfrac{\tau_0}{\tau_I}\dfrac{\theta_0}{\tau_0} & 0 \end{bmatrix} \begin{bmatrix} \alpha_K \\ \alpha_\tau \end{bmatrix} = 0$$

and since $K_0 K_c$, $\tau_0/\tau_I$ and $\theta_0/\tau_0$ are non-zero, the solution set is given by setting
$\alpha_K = 0$ and $\alpha_\tau$ equal to an arbitrary real number. This solution set is not admissible
because $\alpha_K = 0$. At the non-zero frequency solutions to (C.2) the second equation of
(C.1) becomes inconsistent, and therefore there is no solution $\alpha_K$ and $\alpha_\tau$ to (C.1) at
these frequencies. In terms of the solutions given by (7.12), these are frequencies at
which the expressions for $\alpha_K$ and for $\alpha_\tau$ tend to infinity. Therefore, all relevant
solutions to (C.1) are given by (7.12).

Q.E.D.
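The boundary solutions discussed in this proof can be checked numerically. The sketch below is an illustration under stated assumptions, not a transcription of (7.12a)-(7.12b), which are not reproduced in this appendix. It assumes the closed-loop characteristic equation $\tau_I s(\alpha_\tau \tau_0 s + 1) + \alpha_K K_0 K_c (\tau_I s + 1) e^{-\alpha_\theta \theta_0 s} = 0$ for a PI controller with a first-order-plus-time-delay process, evaluated at $s = j\omega/\theta_0$ so that $\omega$ is a delay-normalized frequency. Solving the real and imaginary parts for $\alpha_K$ and $\alpha_\tau$ gives the closed forms used in the code; the function names and numerical values are hypothetical.

```python
import cmath
import math

def boundary_point(omega, k0kc, tau0_over_tauI, theta0_over_tau0, alpha_theta):
    """alpha_K and alpha_tau that place a closed-loop root at s = j*omega/theta0."""
    g0 = tau0_over_tauI * theta0_over_tau0          # equals theta0/tauI
    c, s = math.cos(alpha_theta * omega), math.sin(alpha_theta * omega)
    f_d = omega * c - g0 * s                        # common denominator factor, cf. (C.2)
    f_n = g0 * c + omega * s                        # numerator factor of alpha_tau
    if abs(f_d) < 1e-12:
        raise ValueError("rank-deficient frequency: no finite solution")
    alpha_K = -omega / (k0kc * f_d)
    alpha_tau = -theta0_over_tau0 * f_n / (omega * f_d)
    return alpha_K, alpha_tau

def characteristic(omega, alpha_K, alpha_tau, k0kc, tau0_over_tauI,
                   theta0_over_tau0, alpha_theta):
    """Characteristic function at s = j*omega/theta0, scaled by theta0/tauI."""
    g0 = tau0_over_tauI * theta0_over_tau0
    jw = 1j * omega
    return (jw * (alpha_tau * jw / theta0_over_tau0 + 1.0)
            + alpha_K * k0kc * (jw + g0) * cmath.exp(-1j * alpha_theta * omega))

# Illustrative numbers: theta0/tau0 = 0.5 with assumed Ziegler-Nichols PI settings.
ratio = 0.5
k0kc = 0.9 / ratio                # K0*Kc
tau0_over_tauI = 1.0 / (3.33 * ratio)
aK, aT = boundary_point(1.5, k0kc, tau0_over_tauI, ratio, 1.0)
residual = characteristic(1.5, aK, aT, k0kc, tau0_over_tauI, ratio, 1.0)
print("alpha_K =", aK, "alpha_tau =", aT, "|residual| =", abs(residual))
```

A vanishing residual confirms that the computed pair lies on the stability boundary for the chosen frequency. The `ValueError` branch corresponds to the rank-deficient frequencies excluded in the proof, where the expressions tend to infinity.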


APPENDIX  D 
PROOF  OF  LEMMA  7.3 

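Proof. First the frequencies that give $\alpha_K > 0$ are determined, followed by the
frequencies that give $\alpha_\tau > 0$. The intersection of these two sets of frequencies is the
desired set. Starting with $\omega = 0$, the solution is $\alpha_K = 0$ and $\alpha_\tau$ arbitrary, and therefore
$\omega = 0$ is excluded from the range of frequencies that yield $\alpha_K > 0$ and $\alpha_\tau > 0$. After
using L'Hôpital's rule, the limit as $\omega \to 0^+$ of the solution (7.12a) is

$$\lim_{\omega \to 0^+} \alpha_K = \lim_{\omega \to 0^+} \frac{-1}{K_0 K_c \left( \cos(\alpha_\theta \omega) - \omega \alpha_\theta \sin(\alpha_\theta \omega) - \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta \cos(\alpha_\theta \omega) \right)} = \frac{-1}{K_0 K_c \left( 1 - \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta \right)}$$

so that $\lim_{\omega \to 0^+} \alpha_K > 0$ when $1 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < \infty$, because $K_0 K_c$ is always positive in the set of tuning rules
considered. When $\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta = 1$,

$$\lim_{\omega \to 0^+} \alpha_K = \lim_{\omega \to 0^+} \frac{-1}{K_0 K_c \left( -\omega \alpha_\theta \sin(\alpha_\theta \omega) \right)} = +\infty$$

Therefore $\lim \alpha_K > 0$ when $1 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < \infty$, and $\lim \alpha_K < 0$ when $\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < 1$. The
next step is to determine the positive frequencies at which the sign of $\alpha_K$ changes. As
discussed in Appendix F, for positive frequencies the sign of the numerator of (7.12a) never
changes and the sign of the denominator of (7.12a) changes at $\omega_{d1}, \omega_{d2}, \cdots$. Therefore,
$\omega_{d1}, \omega_{d2}, \cdots$ are the frequencies at which the sign of $\alpha_K$ changes. Starting at $\omega = 0^+$,
$\alpha_K$ is positive when $1 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < \infty$, and as the frequency increases each $\omega_{di}$
represents a frequency at which $\alpha_K$ changes between positive and negative infinity.
Therefore, when $1 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < \infty$, $\alpha_K$ is positive for

$$\omega \in \Omega_{K,a} := (0, \omega_{d1}) \cup (\omega_{d2}, \omega_{d3}) \cup (\omega_{d4}, \omega_{d5}) \cup \cdots$$

For $\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < 1$, $\alpha_K$ is negative at $\omega = 0^+$ and again the sign changes over at each point
$\omega_{di}$ as the frequency increases. Therefore, when $\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < 1$, $\alpha_K$ is positive for

$$\omega \in \Omega_{K,b} := (\omega_{d1}, \omega_{d2}) \cup (\omega_{d3}, \omega_{d4}) \cup (\omega_{d5}, \omega_{d6}) \cup \cdots$$

These frequency ranges for which $\alpha_K$ is positive exclude the end points $\omega_{di}$, because at
these frequencies the denominator of (7.12a) is zero and $\alpha_K$ becomes discontinuous.

To determine the frequency ranges for which $\alpha_\tau$ is positive first consider

$$\lim_{\omega \to 0^+} \alpha_\tau = \lim_{\omega \to 0^+} \frac{-\left( \frac{\theta_0}{\tau_0} \right) \left[ \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0} \cos(\alpha_\theta \omega) + \omega \sin(\alpha_\theta \omega) \right]}{\left[ \omega \cos(\alpha_\theta \omega) - \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0} \sin(\alpha_\theta \omega) \right] \omega} = \frac{-\left( \frac{\theta_0}{\tau_0} \right) \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}}{0^{\pm}} = \pm\infty$$

where the sign of the denominator $0^{\pm}$ is determined from

$$\lim_{\omega \to 0^+} \frac{\omega \cos(\alpha_\theta \omega)}{\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0} \sin(\alpha_\theta \omega)} = \lim_{\omega \to 0^+} \frac{\cos(\alpha_\theta \omega) - \omega \alpha_\theta \sin(\alpha_\theta \omega)}{\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0} \alpha_\theta \cos(\alpha_\theta \omega)} = \frac{1}{\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0} \alpha_\theta}$$

so that

$$\lim_{\omega \to 0^+} \left[ \omega \cos(\alpha_\theta \omega) - \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0} \sin(\alpha_\theta \omega) \right] = \begin{cases} 0^- & \text{when } \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta > 1 \\ 0 & \text{when } \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta = 1 \\ 0^+ & \text{when } \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < 1 \end{cases}$$

Therefore $\lim_{\omega \to 0^+} \alpha_\tau = +\infty$ when $1 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < \infty$, and $\lim_{\omega \to 0^+} \alpha_\tau = -\infty$ when $\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < 1$.

The next step is to determine the positive frequencies for which the sign of $\alpha_\tau$
changes. As discussed in Appendix F, when considering only positive frequencies the
sign of the numerator of (7.12b) changes at the frequencies $\omega_{n1}, \omega_{n2}, \cdots$, and the sign of
the denominator of (7.12b) changes at the frequencies $\omega_{d1}, \omega_{d2}, \cdots$. From equation (F.2)
of Appendix F, $\omega_{ni} \ne \omega_{dj}$ for all $i = 1, 2, \cdots$ and $j = 1, 2, \cdots$; therefore, $\omega_{n1}, \omega_{n2}, \cdots$ and
$\omega_{d1}, \omega_{d2}, \cdots$ are the frequencies at which the sign of $\alpha_\tau$ changes. Starting at $\omega = 0^+$,
$\lim \alpha_\tau = +\infty$ when $1 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < \infty$, and as the frequency increases each $\omega_{ni}$ and each
$\omega_{di}$ represent a frequency at which the sign of $\alpha_\tau$ changes. The relative ordering of $\omega_{ni}$
and $\omega_{di}$ is given by (F.2b). Therefore, when $1 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < \infty$, $\alpha_\tau$ is positive for

$$\omega \in \Omega_{\tau,a} := (0, \omega_{n1}) \cup (\omega_{d1}, \omega_{n2}) \cup (\omega_{d2}, \omega_{n3}) \cup \cdots$$

For the case where $\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < 1$, $\lim_{\omega \to 0^+} \alpha_\tau = -\infty$, and again the sign changes at $\omega_{ni}$ and $\omega_{di}$ as the frequency
increases, where the relative ordering of $\omega_{ni}$ and $\omega_{di}$ is now given by (F.2a). Therefore, when $\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < 1$, $\alpha_\tau$ is positive for

$$\omega \in \Omega_{\tau,b} := (\omega_{d1}, \omega_{n1}) \cup (\omega_{d2}, \omega_{n2}) \cup (\omega_{d3}, \omega_{n3}) \cup \cdots$$

These frequency ranges for which $\alpha_\tau$ is positive exclude the end points, because at $\omega_{ni}$
the numerator of (7.12b) is zero and $\alpha_\tau = 0$, and at $\omega_{di}$ the denominator is zero and
$\alpha_\tau = \pm\infty$, and therefore $\alpha_\tau$ is not strictly positive at these points.

Finally, the frequencies for which both $\alpha_K$ and $\alpha_\tau$ are positive are given by
$\Omega_a = \Omega_{K,a} \cap \Omega_{\tau,a}$ for $1 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < \infty$ and $\Omega_b = \Omega_{K,b} \cap \Omega_{\tau,b}$ for $\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < 1$, which
becomes all

$$\omega \in \Omega_1 \cup \Omega_2 \cup \Omega_3 \cup \cdots$$

where

$$\Omega_1 := \begin{cases} (0, \omega_{n1}) & \text{if } 1 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < \infty \\ (\omega_{d1}, \omega_{n1}) & \text{otherwise} \end{cases}$$

$$\Omega_2 := \begin{cases} (\omega_{d2}, \omega_{n3}) & \text{if } 1 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < \infty \\ (\omega_{d3}, \omega_{n3}) & \text{otherwise} \end{cases}$$

$$\Omega_3 := \begin{cases} (\omega_{d4}, \omega_{n5}) & \text{if } 1 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < \infty \\ (\omega_{d5}, \omega_{n5}) & \text{otherwise} \end{cases}$$

completing the proof.

Q.E.D.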


APPENDIX  E 
PROOF  OF  THEOREM  7.2 

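Proof. Based on the arguments in the brief proof following the statement of the
theorem, it is sufficient to show that the curve obtained from the first frequency range $\Omega_1$
of Lemma 7.3 gives the smallest value of $\alpha_K$ for each value of $\alpha_\tau$. This is done in two
steps. First it is shown that there is at least one value of $\alpha_\tau$ for which $\Omega_1$ gives the
smallest value of $\alpha_K$. Second, due to the continuity of the curves over each frequency
interval, the only way a curve of a different frequency range $\Omega_x$ can give a lower value
of $\alpha_K$ for some other value of $\alpha_\tau$ is if the curve intersects the curve obtained from $\Omega_1$.
It is then shown that the curves never intersect, such that a different frequency range $\Omega_x$
cannot give a lower value of $\alpha_K$.

First, consider $\omega_1 = \frac{1}{2}\frac{\pi}{\alpha_\theta}$, $\omega_2 = \frac{5}{2}\frac{\pi}{\alpha_\theta}$, $\omega_3 = \frac{9}{2}\frac{\pi}{\alpha_\theta}$, etc. From (F.2) it is easy to
show that $\omega_1 \in \Omega_1$, $\omega_2 \in \Omega_2$, $\omega_3 \in \Omega_3$, independent of the value of $\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta$. Now from
(7.12) these frequencies yield

$$\alpha_\tau(\omega_1) = \alpha_\tau(\omega_2) = \alpha_\tau(\omega_3) = \cdots = \left( \frac{\tau_0}{\tau_I} \right)^{-1}$$

and

$$\alpha_K(\omega_1) = \frac{\frac{1}{2}\pi}{K_0 K_c \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta}, \qquad \alpha_K(\omega_2) = \frac{\frac{5}{2}\pi}{K_0 K_c \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta}, \qquad \alpha_K(\omega_3) = \frac{\frac{9}{2}\pi}{K_0 K_c \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta}$$

such that $\alpha_K(\omega_1) < \alpha_K(\omega_2) < \alpha_K(\omega_3) < \cdots$. Therefore, for $\alpha_\tau = \left( \frac{\tau_0}{\tau_I} \right)^{-1}$, $\Omega_1$ gives the
lowest value of $\alpha_K$.

To show that the curves never intersect it must be shown that for every frequency
$\omega_1 \in \Omega_1$ there does not exist a frequency $\omega_x \in \Omega_x$ such that

$$\alpha_K(\omega_1) = \alpha_K(\omega_x) \tag{E.1a}$$

and

$$\alpha_\tau(\omega_1) = \alpha_\tau(\omega_x) \tag{E.1b}$$

To notationally simplify the problem consider the frequency scaling $\omega \to \omega/\alpha_\theta$, then from
(7.12a) and (7.12b) equations (E.1a) and (E.1b) become

$$\frac{\omega_1}{\omega_1 \cos(\omega_1) - \gamma \sin(\omega_1)} = \frac{\omega_x}{\omega_x \cos(\omega_x) - \gamma \sin(\omega_x)} \tag{E.2a}$$

$$\frac{\gamma \cos(\omega_1) + \omega_1 \sin(\omega_1)}{\left( \omega_1 \cos(\omega_1) - \gamma \sin(\omega_1) \right) \omega_1} = \frac{\gamma \cos(\omega_x) + \omega_x \sin(\omega_x)}{\left( \omega_x \cos(\omega_x) - \gamma \sin(\omega_x) \right) \omega_x} \tag{E.2b}$$

where the only free parameter is $\gamma = \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta$. By combining (E.2a) and (E.2b) the
equations simplify to

$$\frac{\omega_1}{\omega_1 \cos(\omega_1) - \gamma \sin(\omega_1)} = \frac{\omega_x}{\omega_x \cos(\omega_x) - \gamma \sin(\omega_x)} \tag{E.3a}$$

$$\frac{\omega_1^2}{\gamma \cos(\omega_1) + \omega_1 \sin(\omega_1)} = \frac{\omega_x^2}{\gamma \cos(\omega_x) + \omega_x \sin(\omega_x)} \tag{E.3b}$$

So, to prove there is no solution $\omega_1$ and $\omega_x$ to (E.1a) and (E.1b) it must be shown that
there is no solution to (E.3a) and (E.3b) for all $\gamma > 0$. To prove by contradiction, assume
that there exists an $\omega_1 > 0$, $\omega_x > \omega_1$, and $\gamma > 0$ such that (E.3a) and (E.3b) hold. Since
$\omega_1$ and $\omega_x$ are considered to be in the intervals given by Theorem 7.2, the denominators
of (E.3a) and (E.3b) are nonzero such that equations (E.3a) and (E.3b) become

$$f_K(\omega_1, \omega_x, \gamma) := \omega_1 \omega_x \cos(\omega_x) - \gamma \omega_1 \sin(\omega_x) - \omega_x \omega_1 \cos(\omega_1) + \gamma \omega_x \sin(\omega_1) = 0 \tag{E.4a}$$

$$f_\tau(\omega_1, \omega_x, \gamma) := \gamma \omega_1^2 \cos(\omega_x) + \omega_1^2 \omega_x \sin(\omega_x) - \gamma \omega_x^2 \cos(\omega_1) - \omega_1 \omega_x^2 \sin(\omega_1) = 0 \tag{E.4b}$$

Now, equations (E.4a) and (E.4b) hold simultaneously if and only if

$$f_{K\tau}(\omega_1, \omega_x, \gamma) := f_K^2(\omega_1, \omega_x, \gamma) + f_\tau^2(\omega_1, \omega_x, \gamma) = 0$$

The function $f_{K\tau}(\omega_1, \omega_x, \gamma)$ is a quadratic polynomial in $\gamma$ of the form

$$f_{K\tau}(\omega_1, \omega_x, \gamma) = f_{\gamma 2}(\omega_1, \omega_x)\, \gamma^2 + f_{\gamma 1}(\omega_1, \omega_x)\, \gamma + f_{\gamma 0}(\omega_1, \omega_x) \tag{E.5}$$

where the functions $f_{\gamma 2}(\omega_1, \omega_x)$, $f_{\gamma 1}(\omega_1, \omega_x)$, and $f_{\gamma 0}(\omega_1, \omega_x)$ are simply products of
powers of $\omega_1$ and $\omega_x$ and sines and cosines of $\omega_1$ and $\omega_x$. For there to be a positive real
solution $\gamma$ to (E.5) the discriminant of $f_{K\tau}(\omega_1, \omega_x, \gamma)$ must be non-negative. The
discriminant is

$$f_{discr}(\omega_1, \omega_x) := f_{\gamma 1}^2(\omega_1, \omega_x) - 4 f_{\gamma 2}(\omega_1, \omega_x)\, f_{\gamma 0}(\omega_1, \omega_x)$$
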
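and after much algebraic and trigonometric manipulation $f_{discr}(\omega_1, \omega_x)$ is given by

$$f_{discr}(\omega_1, \omega_x) = \sum_{i=1}^{3} f_{discr,i}(\omega_1, \omega_x)$$

where

$$f_{discr,1}(\omega_1, \omega_x) = -32 \left( 1 - \cos(\omega_x - \omega_1) \right)^2 \omega_1^3 \omega_x^3 (\omega_x - \omega_1)$$

$$f_{discr,2}(\omega_1, \omega_x) = -16 \left( (1 - \cos(\omega_x - \omega_1)) + (1 - \cos(\omega_1)\cos(\omega_x)) \right) \left( 1 - \cos(\omega_x - \omega_1) \right) \omega_1^2 \omega_x^2 (\omega_x - \omega_1)^2$$

$$f_{discr,3}(\omega_1, \omega_x) = -16 \left( 1 - \cos(\omega_1)\cos(\omega_x) \right) \left( 1 - \cos(\omega_x - \omega_1) \right) \omega_1 \omega_x (\omega_x - \omega_1)^3$$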

which is always non-positive, because $\omega_1 > 0$, $\omega_x > \omega_1$, $\cos(\omega_x - \omega_1) \le 1$, and
$\cos(\omega_1)\cos(\omega_x) \le 1$, and is only zero when $\omega_1 = n\pi$ and $\omega_x = \omega_1 + 2m\pi$ where
$n = 1, 2, \cdots$ and $m = 1, 2, \cdots$. Therefore, there is a real solution $\gamma$ to (E.5) only when
$\omega_1 = n\pi$ and $\omega_x = \omega_1 + 2\pi m$ where $n = 1, 2, \cdots$ and $m = 1, 2, \cdots$. But from (E.3), for
$\omega_1 = n\pi$ and $\omega_x = \omega_1 + 2m\pi$ we have $(\omega_1)^2 = (\omega_1 + 2\pi m)^2$, which is only true when
$m = 0$, implying $\omega_x = \omega_1$, a contradiction. Therefore, there is no solution $\omega_1 > 0$,
$\omega_x > \omega_1$, and $\gamma > 0$ to (E.1a) and (E.1b). As such, the curves never intersect, and
therefore the first frequency interval gives the lowest curve. Q.E.D.
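The sign of the discriminant can also be checked independently of the expanded expression. The following is a sketch of one such check, not the manipulation used in the proof above. It writes (E.4a) and (E.4b) as affine functions of $\gamma$, with the shorthand coefficients $a$, $b$, $c$, $d$ introduced here for illustration only, and applies the Lagrange identity.

```latex
% Shorthand (an assumption of this sketch): f_K = a*gamma + b, f_tau = c*gamma + d, with
\begin{align*}
a &= \omega_x \sin\omega_1 - \omega_1 \sin\omega_x, &
b &= \omega_1\omega_x(\cos\omega_x - \cos\omega_1), \\
c &= \omega_1^2\cos\omega_x - \omega_x^2\cos\omega_1, &
d &= \omega_1\omega_x(\omega_1\sin\omega_x - \omega_x\sin\omega_1) = -\omega_1\omega_x\, a .
\end{align*}
% Then f_K^2 + f_tau^2 is the quadratic (a^2+c^2) gamma^2 + 2(ab+cd) gamma + (b^2+d^2),
% whose discriminant follows from the Lagrange identity:
\begin{align*}
4(ab+cd)^2 - 4(a^2+c^2)(b^2+d^2) &= -4\,(ad-bc)^2 \;\le\; 0, \\
ad - bc &= -\omega_1\omega_x\Big[(\omega_x-\omega_1)^2\big(1-\cos\omega_1\cos\omega_x\big)
          + 2\,\omega_1\omega_x\big(1-\cos(\omega_x-\omega_1)\big)\Big].
\end{align*}
```

Both bracketed terms are non-negative for $\omega_1 > 0$ and $\omega_x > \omega_1$, so in this form the discriminant vanishes only when $\cos(\omega_1)\cos(\omega_x) = 1$ and $\cos(\omega_x - \omega_1) = 1$ hold together. This agrees with the conclusion of the proof that a real $\gamma$ can exist only at $\omega_1 = n\pi$ and $\omega_x = \omega_1 + 2m\pi$.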


APPENDIX  F 
SIGN  CHANGES  IN  EQUATIONS  (7.12A)  AND  (7.12B) 

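The set of frequencies that give $\alpha_K > 0$ and $\alpha_\tau > 0$ is identified by those
frequencies at which the sign of $\alpha_K$ and $\alpha_\tau$ change. In addition, only positive
frequencies are considered because for $\omega = 0$, it is known that $\alpha_K = 0$ and $\alpha_\tau$ is
arbitrary. The sign of $\alpha_K$ changes when the sign of the numerator or the denominator of
(7.12a) changes and the sign of $\alpha_\tau$ changes when the sign of the numerator or the
denominator of (7.12b) changes. For $\omega > 0$ the sign of the numerator of $\alpha_K$ never
changes, and because the numerator of $\alpha_\tau$ is continuous and differentiable the sign of the
numerator of $\alpha_\tau$ changes at those frequencies for which the function $f_n(\omega)$ given by
(7.13) of Theorem 7.2 satisfies

$$f_n(\omega) = 0$$

provided

$$\frac{d f_n(\omega)}{d\omega} := -\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta \sin(\alpha_\theta \omega) + \sin(\alpha_\theta \omega) + \omega \alpha_\theta \cos(\alpha_\theta \omega) \ne 0$$

For $\omega > 0$, the denominators of $\alpha_K$ and $\alpha_\tau$ change sign at those frequencies for which
$f_d(\omega)$ given by (7.14) of Theorem 7.2 satisfies

$$f_d(\omega) = 0$$

provided

$$\frac{d f_d(\omega)}{d\omega} := \cos(\alpha_\theta \omega) - \omega \alpha_\theta \sin(\alpha_\theta \omega) - \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta \cos(\alpha_\theta \omega) \ne 0$$

because the denominators of $\alpha_K$ and $\alpha_\tau$ are also continuous and differentiable. It can
now be shown that $f_n(\omega)$ and $\frac{d f_n(\omega)}{d\omega}$ cannot simultaneously be zero, and likewise for
$f_d(\omega)$ and $\frac{d f_d(\omega)}{d\omega}$. First, the positive zeros of $f_n(\omega)$ given by $\omega_{n1}, \omega_{n2}, \cdots$ and the
positive zeros of $f_d(\omega)$ given by $\omega_{d1}, \omega_{d2}, \cdots$ are characterized. Consider the frequency
scaling $\omega = \omega^*/\alpha_\theta$; the positive zeros of $f_n(\omega)$ and $f_d(\omega)$ are given by the positive solutions
to

$$\cot(\omega^*) = -\frac{\omega^*}{\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta} \tag{F.1a}$$

and

$$\tan(\omega^*) = \frac{\omega^*}{\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta} \tag{F.1b}$$

respectively. These are simply the intersections of the cotangent curve with a
negative-slope line and the tangent curve with a positive-slope line. Let $\omega^*_{n1}, \omega^*_{n2}, \cdots$ be the
positive solutions to (F.1a). For positive values of $\omega^*$ the cotangent curve is negative
only when $\omega^* \in (\frac{\pi}{2}, \pi) \cup (\frac{3\pi}{2}, 2\pi) \cup \cdots$. Also, in each of these ranges the cotangent curve
goes from 0 to $-\infty$, therefore the intersections with a negative-slope line must occur at

$$\tfrac{1}{2}\pi < \omega^*_{n1} < \pi, \qquad \tfrac{3}{2}\pi < \omega^*_{n2} < 2\pi, \qquad \cdots$$

Let $\omega^*_{d1}, \omega^*_{d2}, \cdots$ be the positive solutions to (F.1b), arranged in increasing order. The
tangent curve is positive only when $\omega^* \in (0, \frac{\pi}{2}) \cup (\pi, \frac{3\pi}{2}) \cup \cdots$. Also, in each of these
ranges the tangent curve goes from 0 to $\infty$. When $0 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < 1$ the slope of the line
that intersects the tangent curve is greater than 1, therefore,

$$0 < \omega^*_{d1} < \tfrac{1}{2}\pi, \qquad \pi < \omega^*_{d2} < \tfrac{3}{2}\pi, \qquad \cdots$$

If $1 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < \infty$ then the slope of the line that intersects the tangent curve is less than
1 and in the interval $(0, \frac{\pi}{2})$ there is no intersection with the tangent curve, therefore

$$\pi < \omega^*_{d1} < \tfrac{3}{2}\pi, \qquad 2\pi < \omega^*_{d2} < \tfrac{5}{2}\pi, \qquad \cdots$$

Therefore, if $\omega_{n1}, \omega_{n2}, \cdots$ are the positive zeros of $f_n(\omega)$ and $\omega_{d1}, \omega_{d2}, \cdots$ are the
positive zeros of $f_d(\omega)$ arranged in increasing order, then for $0 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < 1$ it follows
that

$$0 < \omega_{d1} < \frac{1}{2}\frac{\pi}{\alpha_\theta} < \omega_{n1} < \frac{\pi}{\alpha_\theta} < \omega_{d2} < \frac{3}{2}\frac{\pi}{\alpha_\theta} < \omega_{n2} < 2\frac{\pi}{\alpha_\theta} < \cdots \tag{F.2a}$$

and for $1 < \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < \infty$

$$0 < \frac{1}{2}\frac{\pi}{\alpha_\theta} < \omega_{n1} < \frac{\pi}{\alpha_\theta} < \omega_{d1} < \frac{3}{2}\frac{\pi}{\alpha_\theta} < \omega_{n2} < 2\frac{\pi}{\alpha_\theta} < \omega_{d2} < \frac{5}{2}\frac{\pi}{\alpha_\theta} < \cdots \tag{F.2b}$$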

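Similarly, the positive zeros of $\frac{d f_n(\omega)}{d\omega}$ and $\frac{d f_d(\omega)}{d\omega}$ are given by the positive
solutions to

$$\cot(\omega^*) = \frac{\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta - 1}{\omega^*}$$

and

$$\tan(\omega^*) = \frac{1 - \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta}{\omega^*}$$

respectively. Hence, any positive scaled frequency $\omega^*$ at which $f_n(\omega)$ and $\frac{d f_n(\omega)}{d\omega}$ are
simultaneously zero must satisfy

$$-\frac{\omega^*}{\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta} = \frac{\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta - 1}{\omega^*}$$

or

$$(\omega^*)^2 = \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta \left( 1 - \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta \right) \tag{F.3}$$

Also, any positive scaled frequency $\omega^*$ at which $f_d(\omega)$ and $\frac{d f_d(\omega)}{d\omega}$ are simultaneously
zero must satisfy

$$\frac{\omega^*}{\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta} = \frac{1 - \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta}{\omega^*}$$

which also reduces to (F.3). Finally, the frequencies given by (F.3) must satisfy (F.1a)
for $f_n(\omega)$ to be zero and (F.1b) for $f_d(\omega)$ to be zero. Substitution gives

$$\cot\left( \sqrt{\tfrac{\tau_0}{\tau_I}\tfrac{\theta_0}{\tau_0}\alpha_\theta \left( 1 - \tfrac{\tau_0}{\tau_I}\tfrac{\theta_0}{\tau_0}\alpha_\theta \right)} \right) = -\frac{\sqrt{\tfrac{\tau_0}{\tau_I}\tfrac{\theta_0}{\tau_0}\alpha_\theta \left( 1 - \tfrac{\tau_0}{\tau_I}\tfrac{\theta_0}{\tau_0}\alpha_\theta \right)}}{\tfrac{\tau_0}{\tau_I}\tfrac{\theta_0}{\tau_0}\alpha_\theta} \tag{F.4a}$$

and

$$\tan\left( \sqrt{\tfrac{\tau_0}{\tau_I}\tfrac{\theta_0}{\tau_0}\alpha_\theta \left( 1 - \tfrac{\tau_0}{\tau_I}\tfrac{\theta_0}{\tau_0}\alpha_\theta \right)} \right) = \frac{\sqrt{\tfrac{\tau_0}{\tau_I}\tfrac{\theta_0}{\tau_0}\alpha_\theta \left( 1 - \tfrac{\tau_0}{\tau_I}\tfrac{\theta_0}{\tau_0}\alpha_\theta \right)}}{\tfrac{\tau_0}{\tau_I}\tfrac{\theta_0}{\tau_0}\alpha_\theta} \tag{F.4b}$$

respectively. A frequency given by (F.3) is real only when $\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta < 1$, and it is easily
verified that there is no solution $\frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta$ strictly less than 1 to (F.4a) or (F.4b).
Therefore, there is no frequency at which $f_n(\omega)$ and $\frac{d f_n(\omega)}{d\omega}$ are simultaneously zero,
and there is no frequency at which $f_d(\omega)$ and $\frac{d f_d(\omega)}{d\omega}$ are simultaneously zero, implying
that the sign of the numerator of $\alpha_\tau$ changes at the frequencies $\omega_{n1}, \omega_{n2}, \cdots$, and the sign
of the denominators of $\alpha_K$ and $\alpha_\tau$ changes at the frequencies $\omega_{d1}, \omega_{d2}, \cdots$.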
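The interlacing of the sign-change frequencies stated in (F.2a) and (F.2b) can be spot-checked numerically. The sketch below is an illustration only. It works in the scaled frequency $\omega^* = \alpha_\theta \omega$ with the single parameter $\gamma = \frac{\tau_0}{\tau_I}\frac{\theta_0}{\tau_0}\alpha_\theta$, and it assumes the scaled forms $\gamma\cos(\omega^*) + \omega^*\sin(\omega^*)$ and $\omega^*\cos(\omega^*) - \gamma\sin(\omega^*)$ for the numerator and denominator functions, as implied by the derivative expressions in this appendix. Function names are hypothetical.

```python
import math

def zeros(g, x_max, step=1e-3):
    """Positive zeros of g on (0, x_max], found by a sign-change scan plus bisection."""
    roots = []
    a, ga = step, g(step)
    while a < x_max:
        b = a + step
        gb = g(b)
        if ga * gb < 0:
            lo, hi, glo = a, b, ga
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                gm = g(mid)
                if glo * gm <= 0:
                    hi = mid
                else:
                    lo, glo = mid, gm
            roots.append(0.5 * (lo + hi))
        a, ga = b, gb
    return roots

def scaled_zeros(gamma, x_max=3.0 * math.pi):
    """Zeros of the scaled numerator (f_n) and denominator (f_d) functions."""
    f_n = lambda x: gamma * math.cos(x) + x * math.sin(x)
    f_d = lambda x: x * math.cos(x) - gamma * math.sin(x)
    return zeros(f_n, x_max), zeros(f_d, x_max)

pi = math.pi
# gamma < 1: ordering of (F.2a), with a denominator zero below pi/2.
n, d = scaled_zeros(0.5)
print(d[0] < pi / 2 < n[0] < pi < d[1] < 1.5 * pi < n[1] < 2 * pi)
# gamma > 1: ordering of (F.2b), with no denominator zero below pi.
n, d = scaled_zeros(2.0)
print(pi / 2 < n[0] < pi < d[0] < 1.5 * pi < n[1] < 2 * pi < d[1] < 2.5 * pi)
```

A scan for sign changes is used instead of hard-coded brackets so that the check does not presuppose the ordering it is meant to confirm.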